o11c

joined 2 years ago
[–] o11c@programming.dev 1 points 2 years ago (7 children)

In practice, Protocols are a way to make "superclasses" that you can never add features to (for example, readinto, despite being critical for performance, is utterly broken in Python). This should normally be avoided at almost all costs, but for some reason people hate real base classes?

If you really want to do something like the original article, where there's a C-implemented class that you can't change, you're best off using a (named) Union of two similar types, not a Protocol.

I suppose they are useful for operator overloading but that's about it. But I'm not sure if type checkers actually implement that properly anyway; overloading is really nasty in a dynamically-typed language.

[–] o11c@programming.dev 2 points 2 years ago

All of these can be done with raw strings just fine.

For the first pathlib bug case, PATH-like lookup is common, not just for binaries but also data and conf files. If users explicitly request ./foo they will be very upset if your program instead looks at /defaultpath/foo. Also, God forbid you dare pass a Path("./--help") to some program. If you're using os.path.dirname this works just fine.
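A minimal sketch of the first case, using PurePosixPath and posixpath so the result is the same on any platform (the variable names are just for illustration):

```python
import posixpath
from pathlib import PurePosixPath

explicit = "./foo"

# pathlib drops the leading "./", so the "user asked for this exact
# location" information is gone once the string has become a path object.
via_pathlib = str(PurePosixPath(explicit))

# os.path-style functions work on the raw string and keep the ".".
via_dirname = posixpath.dirname(explicit)

# The same normalization turns a safe argument into something that
# looks like a command-line option.
option_like = str(PurePosixPath("./--help"))

print(via_pathlib, via_dirname, option_like)
```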

For the second pathlib bug case, dir/ is often written so that you'll cause explicit errors if there's a file by that name. Also there are programs like rsync where the trailing slash outright changes the meaning of the command. Again, os.path APIs give you the correct result.
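The trailing-slash case can be sketched the same way (again with the pure POSIX flavours, and illustrative variable names):

```python
import posixpath
from pathlib import PurePosixPath

trailing = "dir/"

# pathlib silently discards the trailing slash.
via_pathlib = str(PurePosixPath(trailing))

# posixpath.split keeps the distinction visible: the last component
# is empty, so the caller can tell "dir/" apart from "dir".
head, tail = posixpath.split(trailing)

print(repr(via_pathlib), repr(head), repr(tail))
```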

For the article mistake, backslash is a perfectly legal character in non-Windows filenames and should not be treated as a directory component separator. Thankfully, pathlib doesn't make this mistake at least. OTOH, / is reasonable to treat as a directory component separator on Windows (and some native APIs already handle it, though normalization is always a problem).
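A quick check of the separator behaviour, assuming the pure path classes behave like their concrete counterparts on each platform:

```python
from pathlib import PurePosixPath, PureWindowsPath

# On POSIX the backslash is an ordinary filename character,
# so this is a single component.
posix_parts = PurePosixPath("some\\path").parts

# On Windows the forward slash is accepted as a separator
# and normalized to a backslash when converted back to a string.
windows_parts = PureWindowsPath("some/path").parts
windows_str = str(PureWindowsPath("some/path"))

print(posix_parts, windows_parts, windows_str)
```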

I also just found that the pathlib.Path constructor ignores extra kwargs. But Python has never bothered much with safety anyway, and this is minor compared to the outright bugs the other issues cause.

[–] o11c@programming.dev 1 points 2 years ago

One problem is that Rust doesn't support tagged unions. enum is regrettably solving a different problem, but since it's the only hammer we have, it's abused for this kind of thing. This often leads to having to write match error ... unreachable.

[–] o11c@programming.dev 1 points 2 years ago

The default handling is pretty important.

What I find more interesting are 1. the two-argument form of iter, and 2. the __getitem__ auto-implementation that causes there to be two incompatible definitions of Iterable.
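For anyone who hasn't seen these two, a small sketch (the Squares class and the chunk size are made up for the example):

```python
import io
from collections.abc import Iterable

# 1. Two-argument iter: call the function repeatedly until it
#    returns the sentinel (here, an empty bytes object at EOF).
stream = io.BytesIO(b"abcdefghij")
chunks = list(iter(lambda: stream.read(4), b""))


# 2. A class with only __getitem__ is iterable via the legacy
#    sequence protocol: iteration calls obj[0], obj[1], ... until
#    IndexError is raised.
class Squares:
    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return index * index


squares = Squares()
as_list = list(squares)

# ...but the ABC only looks for __iter__, so it disagrees.
claims_iterable = isinstance(squares, Iterable)

print(chunks, as_list, claims_iterable)
```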

(btw your comments are using accidental formatting; use backticks: __next__)

[–] o11c@programming.dev 3 points 2 years ago (2 children)

The problem with pathlib is that it normalizes away critical information so can't be used in many situations.

./path is not the same as path, which is not the same as path/.

Also the article is wrong about "Path('some\\path') becomes some/path on Linux/Mac."

[–] o11c@programming.dev 1 points 2 years ago* (last edited 2 years ago)

I've done something similar. In my case it was a startup script that did something like the following:

  • poll github using the search API for PR labels (note that this has sometimes stopped returning correct results, but ...).
    • always do this once at startup
    • you might do this based on notifications; I didn't bother since I didn't need rapid responsiveness. Note that you should not do this for the specific data from a notification though; it's only a way to wake up the script.
    • but no matter what, you should do this after N minutes, since notifications can be lost.
  • perform a git fetch for your main development branch (the one you perform the real merges to) and all pull/ refs (git does not do this by default; you'll have to set them up for your local test repo. Note that you want to refer to the unmerged commits for these)
  • if the set of commits for all tagged PRs has not changed, wait and poll again
  • reset the test repo to the most recent commit from your main development branch
  • iterate over all PRs with the appropriate label:
    • ordering notes:
      • if there are commits that have previously tested successfully, you might do them first. But still test again since the merge order could be different. This of course depends on the level of tests you're doing.
      • if you have PRs that depend on other PRs, do them in an appropriate order (perhaps the following will suffice, or maybe you'll have some way of detecting this). As a rule we soft-forbid this though; such PRs should have been merged early.
      • finally, ordering by PR number is probably better than ordering by last commit date
    • attempt the merge (or rebase). If a nop, log that somewhere. If not clean, skip the PR for now (and log that), but only mark this as an error if it was the first PR you've merged (since if there's a conflict it could be a prior PR's fault).
    • Run pre-build stuff that might need to create further commits, build the product, and run some quick tests. If they fail, rollback the repo to the previous merge and complain.
    • Mark the commit as apparently good. Note that this is specifically applying to commits not PRs or branch names; I admit I've been sloppy above.
  • perform a pre-build, build and quick test again (since we may have rolled back and have a dirty build - in fact, we might not have ended up merging anything!)
  • if you have expensive tests, run them only here (and treat this as "unexpected early exit" below). It's presumed that separate parts of your codebase aren't too crazily entangled, so if a particular test fails it should be "obvious" which PR is relevant. Keep in mind that I used this system for assumed viable-work-in-progress PRs.
  • kill any existing instance and launch a new instance of the product using the build from the final merged commit and begin accepting real traffic from devs and beta users.
  • users connecting to the instance should see the log
  • if the launched instance exits unexpectedly within M minutes AND we actually ended up merging anything into the known-good branch, then reset to the main development branch (and build etc.) so that people at least have a functioning test server, but complain loudly in the MOTD when they connect to it. The condition here means that if it exits suddenly again the whole script goes up and starts again, which may be necessary if someone intentionally tried to kill the server to force a new merge sequence but it was too soon.
    • alternatively you could try bisecting the set of PR commits or something, but I never bothered. Note that you probably can't use git bisect for this since you explicitly do not want to try a commit from the middle of a PR. It might be simpler to whitelist or blacklist one commit at a time, but if you're failing here remember that all tests are unreliable.
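
The per-PR loop from the list above could be sketched roughly as follows. This is not the original script; run_merge_pass, PassResult and the three callbacks are hypothetical names, and the real git and build work is assumed to live inside those callbacks.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class PassResult:
    merged: list = field(default_factory=list)   # merged and passed quick tests
    skipped: list = field(default_factory=list)  # merge was not clean
    failed: list = field(default_factory=list)   # merged, but build/tests failed
    errors: list = field(default_factory=list)   # conflicts that can't be a prior PR's fault


def run_merge_pass(
    pr_numbers: Iterable[int],
    try_merge: Callable[[int], bool],     # hypothetical: True if the merge was clean
    build_and_test: Callable[[], bool],   # hypothetical: pre-build, build, quick tests
    rollback: Callable[[], None],         # hypothetical: reset to the previous merge
) -> PassResult:
    result = PassResult()
    for pr in sorted(pr_numbers):  # order by PR number, not by last commit date
        if not try_merge(pr):
            result.skipped.append(pr)
            if not result.merged:
                # Nothing else has been merged yet, so the conflict is
                # against the main branch itself: report it as an error.
                result.errors.append(pr)
            continue
        if build_and_test():
            result.merged.append(pr)
        else:
            rollback()
            result.failed.append(pr)
    return result


# Tiny simulation: the "repo" is a list of merged PR numbers.
repo = []


def fake_merge(pr):
    if pr == 2:          # pretend PR 2 conflicts
        return False
    repo.append(pr)
    return True


demo = run_merge_pass(
    [3, 1, 2],
    try_merge=fake_merge,
    build_and_test=lambda: 3 not in repo,  # pretend PR 3 breaks the build
    rollback=lambda: repo.pop(),
)
print(demo)
```

The final build, the expensive tests and the launch step would happen after this function returns, whether or not anything ended up in merged.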

[–] o11c@programming.dev 1 points 2 years ago

Honestly you probably should think about how to translate them. Python at least rolls its own .mo parser so it can support multiple languages in a single process; it's much more difficult in C unless you push it to the clients (which requires pushing the parameterization as well).

Non-.pot-based internationalization formats are almost always braindead and should be avoided.

[–] o11c@programming.dev 1 points 2 years ago

Note that by messing with a particular module's __path__ you can turn it into a "package" that loads from arbitrary directories.
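
A minimal sketch of the idea, assuming a throwaway module named fakepkg and a temporary directory standing in for the "arbitrary directory" (both are made up for the example):

```python
import importlib
import os
import sys
import tempfile
import types

# A directory that is not on sys.path, containing one plain module.
plugin_dir = tempfile.mkdtemp()
with open(os.path.join(plugin_dir, "hello.py"), "w") as f:
    f.write("GREETING = 'hi'\n")

# An ordinary module object becomes package-like as soon as it has
# a __path__; the import system searches that list for submodules.
fake = types.ModuleType("fakepkg")
fake.__path__ = [plugin_dir]
sys.modules["fakepkg"] = fake

importlib.invalidate_caches()
hello = importlib.import_module("fakepkg.hello")
print(hello.GREETING)
```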

[–] o11c@programming.dev 3 points 2 years ago* (last edited 2 years ago)

No. Duck types (including virtual subclasses) considered harmful; use real inheritance if your language doesn't provide anything strictly better.

It is incomparably convenient to be able to retroactively add "default implementations" to interface functions (consider for example how broken readinto is in Python). Some statically-typed languages let you do that without inheritance, but no dynamically-typed language can.
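
To illustrate what a retroactive default looks like with a real base class (Reader and CountingReader are hypothetical, not the io module's classes):

```python
class Reader:
    """Base class that subclasses are required to inherit from."""

    def read(self, size: int) -> bytes:
        raise NotImplementedError

    # Imagine this method being added in a later release. Every
    # existing subclass gets a working (if slow) version for free,
    # and performance-sensitive subclasses can override it.
    def readinto(self, buffer: bytearray) -> int:
        data = self.read(len(buffer))
        buffer[: len(data)] = data
        return len(data)


class CountingReader(Reader):
    """Written before readinto existed; only knows about read."""

    def read(self, size: int) -> bytes:
        return bytes(range(size))


buf = bytearray(4)
count = CountingReader().readinto(buf)
print(count, buf)
```

With a Protocol or a duck type there is no shared class to put that default on, so every caller has to check for the method and fall back by hand.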

This reads more as a rant against inheritance (without any explanation whatsoever) than a legitimate argument.

[–] o11c@programming.dev 1 points 2 years ago (1 children)

I likewise don't really use Godot, but for graphics in general, the 4th coordinate is important, even if it is "usually" 1. It's most obvious when you try to correctly interpolate near the poles of a sphere with a single rectangular texture, but think for a minute what "near" means.

Back to the main point though: the important things we normally rely on for matrix math are associativity (particularly, for exponentiation!) and anticommutativity (beware definitions that are sloppy about "inverse").
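
A small sketch of why associativity matters for exponentiation, and of multiplication order mattering; matmul and matpow are hand-rolled helpers for the example, not any library's API:

```python
def matmul(a, b):
    """Multiply two matrices given as tuples of row tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0])))
        for i in range(len(a))
    )


def matpow(m, n):
    """Exponentiation by squaring; regrouping the products like this
    is only valid because matrix multiplication is associative."""
    size = len(m)
    result = tuple(tuple(int(i == j) for j in range(size)) for i in range(size))
    while n:
        if n & 1:
            result = matmul(result, m)
        m = matmul(m, m)
        n >>= 1
    return result


fib = matpow(((1, 1), (1, 0)), 10)

# Order matters: swapping the operands gives a different product.
a = ((1, 1), (0, 1))
b = ((1, 0), (1, 1))
ab = matmul(a, b)
ba = matmul(b, a)
print(fib, ab, ba)
```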

[–] o11c@programming.dev 0 points 2 years ago

You should have part of your test harness perform a separate import of every module. If your module is idempotent (most good code is) you could do this in a single process by cleaning sys.modules I guess ... but it still won't be part of your pytest process.
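
One way to do the separate import is a fresh interpreter per module. A rough sketch, where imports_cleanly is a made-up helper name and json stands in for your own modules:

```python
import subprocess
import sys


def imports_cleanly(module_name: str) -> bool:
    """Import one module in a brand-new interpreter, so nothing that
    the test process already imported can hide a missing import."""
    proc = subprocess.run(
        [
            sys.executable,
            "-c",
            "import importlib, sys; importlib.import_module(sys.argv[1])",
            module_name,
        ],
        capture_output=True,
        text=True,
    )
    return proc.returncode == 0


ok = imports_cleanly("json")
missing = imports_cleanly("no_such_module_for_this_example")
print(ok, missing)
```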

Static analyzers can only detect some cases, so can't be fully trusted.

I've also found there are a lot of cases where performant Python code has to be implemented in a distinct way from what the type-checker sees. You can do this with aggressive type: ignore but I often find it cleaner to use separate if blocks.
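
A sketch of the separate-blocks approach, assuming the split is made on typing.TYPE_CHECKING (that choice, and the first_byte example, are mine):

```python
import operator
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # What the type checker sees: a plain, fully annotated function.
    def first_byte(data: bytes) -> int:
        return data[0]
else:
    # What actually runs: a C-implemented callable with the same
    # behaviour, which is awkward to annotate precisely.
    first_byte = operator.itemgetter(0)

value = first_byte(b"abc")
print(value)
```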

[–] o11c@programming.dev 3 points 2 years ago (1 children)
from __future__ import annotations