Bug reports on any software


When a bug tracker is inside the exclusive walled-gardens of MS Github or Gitlab.com, and you cannot or will not enter, where do you file your bug report? Here, of course. This is a refuge where you can report bugs that are otherwise unreportable due to technical or ethical constraints.

⚠ Of course, there are no guarantees it will be seen by anyone relevant. Hopefully some kind souls will volunteer to proxy the reports.

founded 3 years ago
MODERATORS
1
 
 

The linked article showcases a disaster of the text previewer in the stock Lemmy client. It makes sense that linefeeds would be stripped to some extent, but when the content relies on a linebreak for every line because it’s important for formatting, it’s a disaster when you have half a screen of text.

The fix: the preview code should count the number of linefeeds it removes. If it removes more than ~4 or so linefeeds, it should be clear that it’s not dealing with normal sized paragraphs. In this case, it should only show a few lines (with linefeeds) and have a spoiler or expansion option.
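A rough sketch of that first fix, in Python purely for illustration (the stock client is not written in Python). The function name `build_preview` and the thresholds are my own assumptions, not anything taken from Lemmy's code:

```python
def build_preview(text: str, max_removed: int = 4, keep_lines: int = 3) -> dict:
    """Return a preview for `text` (hypothetical helper, not Lemmy's API).

    If flattening the text would strip more than `max_removed` linefeeds,
    the content is treated as line-oriented: the first `keep_lines` lines
    are kept with their linefeeds and the rest is marked as collapsed.
    """
    lines = text.split("\n")
    removed = len(lines) - 1  # linefeeds that flattening would strip
    if removed <= max_removed:
        # Normal-sized paragraph: flatten as the previewer does today.
        flat = " ".join(line.strip() for line in lines if line.strip())
        return {"preview": flat, "collapsed": False}
    # Line-oriented content: keep a few lines intact behind an expander.
    return {"preview": "\n".join(lines[:keep_lines]), "collapsed": True}
```

A client would render the `collapsed` case behind a spoiler or "expand" control instead of dumping half a screen of run-together text.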

Another simpler fix: have a “suppress preview” tickbox so an author can manually clear a bad preview box.

2
 
 

An original poster asks a question or attempts to create a thread to compile information about a topic, and there is always some clown or asshole who cannot resist posting a snide remark. If the snide remark is clever or captures the sentiment of many, it gets a flood of up votes and rises to the top, bringing with it a tree of replies to the snide remark. Useful constructive answers get buried because they are boring to the wider audience who just likes to see a good roasting. I think there are more kids in the threadiverse than we expect.

So content that’s nearly garbage dominates the thread and drowns out the thread’s purpose, disservicing the OP and all those who want the same answer or collaboration. It’s a design failure of Lemmy to be blind to this very basic characteristic of human nature.

Censorship is unreasonable in this situation. But so is the status quo. Nothing wrong with a bunch of clowns having fun, but that fun should happen non-disruptively on the sidelines and out of the way. The OP has a mission and purpose. The OP should be able to click a red fish that flags a post as a red herring. From there, that tree should be pushed out of the way somehow.. to a sidebar or folded, or a subthread of sorts.. call it the clown room. Critics who just want to bitch or push contempt should still have a voice. Make it so they have to click a “criticism” button to then step into a space with unwanted criticism.

There is wanted criticism and unwanted criticism. An OP might say “Roast me..” or “what’s wrong with this approach?” If the OP intends for the discussion to be controversial, then the OP obviously has no interest in flagging anything. But if the OP has a mission to accomplish, they should have a control.

Another way to look at this is the fedi could use a stackexchange replacement. Stackexchange never has garbage getting high ranks. I’ve never had an acct there so I don’t know how they manage it, but it seems Lemmy could learn from that.

3
 
 

Every Lemmy instance chooses its own name for the meta community. Some don’t even have one. Some choose quite bizarre names.

That’s shit. If you walk into an office building, the receptionist is almost always close to the main entrance. When you enter a restaurant, the host(ess) is either close to the front door or there is a clear path to the host(ess). Yet Lemmy is terribly organised in this way. The power of defaults can go a long way here. A meta community should be created by default with a default name. And by default it should be listed at the top on the communities list.

Best way to cope with the madness is sort communities chronologically with oldest first. But it’s not solid. Sometimes the meta community is created late in the game.

4
 
 

Both Lemmy and mbin have a shitty way of treating authors of content that is censored by a moderator.

Lemmy: if your post is removed from a community timeline, you still have the content. In fact, your logged-in profile looks no different, as if the message is still there. It’s quite similar to shadow banning. Slightly better though because if you pay attention or dig around, you can at least discover that you were censored. But shitty nonetheless that you get no notification of the censorship.

Mbin: if your post is removed, you are subjected to data loss. I just wrote a high effort post to europe@feddit.org and it was censored for not being “news”. There is no rule that your post must be news, just a subtle mention of news in the topic. In fact they delete posts that are not news, despite not having a rule along those lines. So my article is lost due to this heavy-handed moderation style. Mbin authors are not deceived about the status of their post like on Lemmy, but authors suffer from data loss. They do not get a copy of what they wrote so they cannot recover and post it elsewhere.

It’s really disgusting that a moderator’s trigger happy delete button has data loss for someone else as a consequence. I probably spent 30 minutes writing the post only to have that effort thrown away by a couple clicks. Data loss is obviously a significant software defect.

5
 
 

Discuss. (But plz, it’s only interesting to hear from folks who have some healthy degree of contempt for exclusive corporate walled-gardens and the technofeudal system the fedi is designed to escape.)

And note that links can come into existence that are openly universally accessible and then later become part of a walled-garden... and then later be open again. For example, youtube. And a website can become jailed in Cloudflare but then be open again at the flip of a switch. So a good solution would be a toggle of sorts.

6
 
 

When an arrogant presumptuous dick dumps hot-headed uncivil drivel into a relatively apolitical thread about plumbing technology and reduces the quality of the discussion to a Trump vs. $someone style shitshow of threadcrap, the tools given to the moderator are:

  • remove the comment (chainsaw)
  • ban the user from the community (sledge hammer)

Where are the refined sophisticated tools?

When it comes to nannying children, we don’t give teachers a baseball bat. It’s the wrong tool. We are forced into a dilemma: either let the garbage float, or censor. This encourages moderators to be tyrants and too many choose that route. Moderators often censor civil ideas purely because they want to control the narrative (not the quality).

I want to do quality control, not narrative control. I oppose the tyranny of censorship in all but the most vile cases of bullying or spam. The modlog does not give enough transparency. If I wholly remove that asshole’s comment, then I become an asshole too.

He is on-topic. Just poor quality drivel that contributes nothing of value. Normally voting should solve this. X number of down votes causes the comment to be folded out of view, but not censored. It would rightfully keep the comment accessible to people who want to pick through the garbage and expand the low quality posts.

Why voting fails:

  • tiny community means there can never be enough down votes to fold a comment.
  • votes have no meaning. Bob votes emotionally and down votes every idea he dislikes, while Alice down votes off-topic or uncivil comments, regardless of agreement.

Solutions:

I’m not trying to strongly prescribe a fix in particular, but have some ideas to brainstorm:

  • Mods get the option to simply fold a shitty comment when the msg is still on-topic and slightly better quality than spam. This should come with a one-line field (perhaps mandatory) where the mod must rationalise the action (e.g. “folded for uncivil rant with no useful contribution to the technical information sought”).
  • A warning counter. Mods can send a warning to a user in connection with a comment. This is already possible but requires moderators to have a superhuman memory. A warning should not just be like any DM.. it should be tracked and counted. Mods should see a counter next to participants indicating how many warnings they have received and a page to view them all, so as to aid in decisions on whether to ban a user from a community.
  • Moderator votes should be heavier than user votes. Perhaps an ability to choose how many votes they want to cast on a particular comment to have an effect like folding. Of course this should be transparent so it’s clear that X number of votes were cast by a mod. Rationale:
    • mods have better awareness of the purpose and rules of the community
    • mods are stakeholders with more investment into the success of a community than users
  • Moderators could control the weight of other users’ votes. When 6 people upvote an uncivil post and only 2 people down vote it, it renders voting as a tool impotent and in fact harm-inducing. Lousy/malicious voters have no consequences for harmful voting and thus no incentive to use voting as an effective tool for good. A curator should be able to adjust voting weight accordingly. E.g. take an action on a particular poll that results in a weight adjustment (positive or negative) on the users who voted a particular direction. The effect would be to cause voters to prioritize civil quality above whether they simply like/dislike an idea, so that votes actually take on a universal meaning. Which of course then makes voting an effective tool for folding poor quality content (as it was originally intended).
  • (edit) Ability for a moderator to remove a voting option. If a comment is uncivil, allowing upvotes is only detrimental. So a moderator should be able to narrow the ballot to either down vote or neutral. And perhaps the contrary as well (like Beehaw does instance-wide). And perhaps the option to neutralise voting on a specific comment.
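
To make the weighted-vote ideas above a bit more concrete, here is a minimal sketch. Everything in it is an assumption for brainstorming purposes (the `Vote` record, the `fold_threshold` of -3, the per-voter weight table); none of it reflects how Lemmy actually stores or counts votes:

```python
from dataclasses import dataclass

@dataclass
class Vote:
    voter: str
    value: int            # +1 for an up vote, -1 for a down vote
    is_mod: bool = False
    mod_weight: int = 1   # how many votes a moderator chose to cast

def score(votes, voter_weights=None):
    """Sum the votes. Moderator votes count `mod_weight` times; ordinary
    voters count with a per-user weight that a curator may have adjusted
    (default 1.0)."""
    voter_weights = voter_weights or {}
    total = 0.0
    for v in votes:
        weight = v.mod_weight if v.is_mod else voter_weights.get(v.voter, 1.0)
        total += v.value * weight
    return total

def is_folded(votes, fold_threshold=-3, voter_weights=None):
    """Fold (not remove) the comment once the weighted score is low enough."""
    return score(votes, voter_weights) <= fold_threshold

def mod_votes_cast(votes):
    """Transparency: how many votes on this comment came from moderators."""
    return sum(v.mod_weight for v in votes if v.is_mod)
```

With the 6-up/2-down example from the list, the plain score is +4 and nothing folds. A single moderator casting 8 down votes brings it to -4, which folds the comment while `mod_votes_cast` keeps it visible that 8 of the votes were the mod's.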
7
3
submitted 2 months ago* (last edited 2 months ago) by activistPnk@slrpnk.net to c/bugs@sopuli.xyz
 
 

If you open a PDF document in the browser (thus in pdf.js) and click the down arrow (↓) to save it locally, it redownloads the document instead of simply saving it from the cache. If you lose network connectivity or disconnect then try to save the PDF locally for later viewing, the browser reports connection issues when there was no need for the network.

Tor Browser (Firefox based) does not have this problem.

8
9
 
 

An important part of the Youtube content is the transcript at the bottom of the video description. There are some 3rd-party sites that collect and share the YT transcripts separately but then the naive admins put the service in Cloudflare’s walled garden, which is worse than YT itself and purpose-defeating to a large extent. (exceptionally this service is CF-free, but it says “Transcript is disabled on this video” in my test: https://youtubetranscript.io/)

Invidious should be picking up the slack here.

And Lemmy could do better by automatically fetching the transcript of youtube/invidious links and including it, perhaps spoiler style like this.

10
2
submitted 3 months ago* (last edited 3 months ago) by activistPnk@slrpnk.net to c/bugs@sopuli.xyz
 
 

I browse with images disabled. But sometimes I encounter a post where I want to see the image, like this one:

https://iejideks5zu2v3zuthaxu5zz6m5o2j7vmbd24wh6dnuiyl7c6rfkcryd.onion/@JosephMeyer@c.im/112923392848232303

When opening that link in a browser configured to fetch images, it redirects to the original instance, which is inside an access-restricted walled garden. This seems like a new behaviour for Mastodon thus may be a regression.

It’s a terrible design because it needlessly forces people on open decentralised networks into centralised walled gardens. The behaviour arises out of the incorrect assumption that everyone has equal access. As Cloudflare proves, access equality is non-existent. The perversion in this particular case is an onion is redirecting to Cloudflare (an adversary to all those who have onion access).

There should be two separate links to each post: one to the source node, and one to the mirror. This kind of automatic redirect is detrimental. Lemmy demonstrates the better approach of giving two links and not redirecting. (But Lemmy has that problem of not mirroring images).

11
 
 

There are some very slow nodes (like Beehaw) where the server is apparently so overworked it cannot render a login form most of the time. The browser times out waiting. In the rare moments that there is a login opportunity, about ½ the time the login fails with a 2 second popup saying “incorrect login credentials”.

It’s quite terrible because obviously users would assume their account has been deleted, because that’s how most online services work. Admins do not generally give warnings or say why an account is deleted. They just hit the delete button. Like Milton in Office Space who was not told he was laid off.. they just “fixed the payroll glitch”. This is generally how communication works on communication platforms.. admins just pull the plug.

So because of how people learn that their account is deleted, users cannot distinguish a purposeful account removal from a faulty server. If you have a Beehaw account and you are told “incorrect login credentials”, don’t believe it. Keep trying. Eventually you’ll get in.

12
 
 

In the stock Lemmy web client there is apparently no mechanism for users to fetch their history of posts. The settings page gives only a way to download settings. This contrasts with Mastodon where users can grab an archive of everything they have posted which is still stored on the server.

Or am I missing something?

IIUC, there is no GDPR issue here because no data is personal (because all Lemmy accounts are anonymous). But if a Lemmy server were to hypothetically require users to identify themselves with first+last name, then the admin would have a substantial manual burden to comply with GDPR Art.20 requests. Correct?

13
 
 

These environment variables designate a parameter that holds the value of an HTTP proxy:

  • http_proxy
  • https_proxy
  • HTTP_PROXY
  • HTTPS_PROXY

It’s a convention, but the name “HTTP proxy” can only imply an HTTP proxy, not a SOCKS proxy. The golang¹ standard libraries expect the above HTTP proxy parameters to specify a SOCKS proxy. How embarrassing is that? So any Go app that offers a proxy feature replicates the mistake of getting the proxy kind backwards, such as hydroxide, which requires passing a SOCKS proxy as an HTTP proxy.
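
For anyone wanting to check what they are actually feeding these variables, here is a small Python sketch that reports whether each one holds an HTTP-style or a SOCKS-style proxy URL. The helper names `proxy_kind` and `audit_proxy_env` are made up for this post, and the check only looks at the URL scheme. It says nothing about what any particular program does with the value:

```python
import os
from urllib.parse import urlsplit

PROXY_VARS = ("http_proxy", "https_proxy", "HTTP_PROXY", "HTTPS_PROXY")

def proxy_kind(value: str) -> str:
    """Classify a proxy URL by its scheme only (hypothetical helper)."""
    scheme = urlsplit(value).scheme.lower()
    if scheme in ("http", "https"):
        return "http"
    if scheme.startswith("socks"):  # socks4, socks5, socks5h, ...
        return "socks"
    return "unknown"

def audit_proxy_env(environ=os.environ) -> dict:
    """Map each proxy variable that is set to the kind of proxy it names."""
    return {name: proxy_kind(environ[name])
            for name in PROXY_VARS if name in environ}
```

Running `audit_proxy_env()` before launching an app at least makes it obvious when a variable named after HTTP is carrying a SOCKS address.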

¹ “Go” is such a shitty unsearchable name for a language. It’s no surprise that the developers of the language infra itself struggle with the nuances of natural language. HTTP≠SOCKS. And IIUC, this language is a product of Google. WTF. It’s the kind of amateurish screwup you would expect to come from some teenager’s mom’s basement, not a fortune 500 company among the world’s biggest tech giants.

(edit)
It’s a bit amusing and simultaneously disappointing that reporting bugs and suggesting enhancements to Google’s language requires using Microsoft’s platform:

https://github.com/golang/proposal#the-proposal-process

FOSS developers: plz avoid Golang - it’s a shit show.

14
 
 

Lingva & Simply Translate are two different front-ends to Google Translate. I’m not running the software myself because I run Argos locally (for privacy), but when Argos gives a really bad translation I resort to Lingva and Simply Translate instances.

I tried to translate a privacy policy. Results:

Lingva instances:

  • translate.plausibility.cloud ← goes to lunch
  • lingva.lunar.icu ← gives “414 Request-URI Too Large”
  • lingva.ml & lingva.garudalinux.org ← fuck off Cloudflare! Obviously foolishly purpose defeating to surreptitiously expose people to CF who are trying to avoid direct Google connections.
  • translate.igna.wtf ← dead
  • translate.dr460nf1r3.org ← dead

Simply Translate instances (list of instances broken for me but found a year-old mirror of that):

  • simplytranslate.org ← just gives a blank
  • st.tokhmi.xyz ← up but results are just CSS garbage
  • translate.bus-hit.me (ST fork mozhi) ← shoots a blank result
  • simplytranslate.pussthecat.org ← redirects to mozhi.pussthecat.org
  • mozhi.pussthecat.org (ST fork mozhi) ← shoots a blank result
  • translate.projectsegfau.lt (ST fork mozhi) ← translates the first word then drops the rest; this instance is incorrectly listed as Lingva
  • translate.northboot.xyz ← up but results are just CSS garbage
  • st.privacydev.net ← up but results are just CSS garbage
  • tl.vern.cc ← up but results are just CSS garbage

~~It looks as if Simply Translate is not keeping up with Google API changes.~~ (edit: actually the CSS garbage is what we get when feeding it bulky input -- those instances work on small input)

graveyard of dead sites:

  • simplytranslate.manerakai.com ← redirects to vacated site
  • translate.josias.dev
  • translate.riverside.rocks
  • translate.tiekoetter.com
  • simplytranslate.esmailelbob.xyz
  • translate.slipfox.xyz
  • translate.priv.pw
  • st.odyssey346.dev
  • fyng2tsmzmvxmojzbbwmfnsn2lrcyftf4cw6rk5j2v2huliazud3fjid.onion
  • xxtbwyb5z5bdvy2f6l2yquu5qilgkjeewno4qfknvb3lkg3nmoklitid.onion
  • translate.prnoid54e44a4bduq5due64jkk7wcnkxcp5kv3juncm7veptjcqudgyd.onion
  • simplytranslate.esmail5pdn24shtvieloeedh7ehz3nrwcdivnfhfcedl7gf4kwddhkqd.onion
  • tl.vernccvbvyi5qhfzyqengccj7lkove6bjot2xhh5kajhwvidqafczrad.onion
  • st.g4c3eya4clenolymqbpgwz3q3tawoxw56yhzk4vugqrl6dtu3ejvhjid.onion

Why this is a bug


Front-ends and proxies exist to circumvent the anti-features of the service they are facilitating access to. So if there is a volume limitation, the front-end should be smart enough to split the content into pieces, translate the pieces separately, and reassemble. In fact that should be done anyway for privacy, to disassociate pieces of text from each other.

An alternative (and probably better) would be to have a front-end for the front-ends. Something that gives a different paragraph to several different Lingva/ST instances and reassembles the results. This would (perhaps?) link a different IP to each piece assuming the front-ends also proxy (not sure if that’s the case).
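
The split/translate/reassemble idea could look roughly like this. It is only a sketch under assumptions: each "backend" is any callable that takes text and returns translated text (how it talks to a Lingva/ST instance is deliberately left out), and the 1000-character limit is a guess, not a documented limit of either project:

```python
from typing import Callable, List, Sequence

def split_paragraph(para: str, max_chars: int) -> List[str]:
    """Cut one paragraph into chunks of at most max_chars, preferring to
    break at a space. A chunk with no space at all is cut mid-word."""
    chunks = []
    while len(para) > max_chars:
        cut = para.rfind(" ", 0, max_chars + 1)
        if cut <= 0:
            cut = max_chars
        chunks.append(para[:cut])
        para = para[cut:].lstrip()
    if para or not chunks:
        chunks.append(para)
    return chunks

def translate_in_pieces(text: str,
                        backends: Sequence[Callable[[str], str]],
                        max_chars: int = 1000) -> str:
    """Send each chunk to the next backend in round-robin order and
    reassemble: chunks of one paragraph are rejoined with a space,
    paragraphs with a blank line."""
    out, n = [], 0
    for para in text.split("\n\n"):
        done = []
        for chunk in split_paragraph(para, max_chars):
            backend = backends[n % len(backends)]
            done.append(backend(chunk))
            n += 1
        out.append(" ".join(done))
    return "\n\n".join(out)
```

Rotating through the backends means no single instance sees the whole document, which is the privacy point made above. Translating chunks in isolation will cost some context at chunk boundaries.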

15
 
 

cross-posted from: https://slrpnk.net/post/11375008

Whoever designed the OSM db either never uses ATM machines or they have never experienced anything like the ATM disaster in Netherlands. The OSM db has most ATM brands incorrect for Netherlands and seriously needs more fields so travelers can actually find a functioning ATM.

brands are mostly incorrect

Pick any Dutch city. Search » Categories » custom search » Finance » ATM. The brands are mostly misinfo. These ATM brands do not exist anywhere in Netherlands:

  • Rabobank
  • ABN AMRO
  • ING
  • SNS

All those banks removed all their ATM machines and joined a monopolistic consortium called “Geldmaat”. There is generally an ATM at those locations but it’s always a Geldmaat ATM. So a simple find and replace is needed on all the Dutch maps.

For indoor ATMs, the brand is often incorrectly named after the shop it’s in. That’s useful for finding it but still missing important info: the actual ATM brand. ATM brand is very important because different ATM brands give differing degrees of shitty treatment. If brand X refuses your card, all instances of that ATM brand will likely refuse your card. So the “brand” field should always reflect the ATM operator. Having a separate shop name field would be useful for locating the machine.

missing key attributes

Travelers should not have to spend hours running from one ATM to another until they find one that works. There are lots of basic variables that need to be accounted for in the db:

  • (real or fixed point) ATM fee
  • (enum set) currencies other than local (a rare but very useful option is to e.g. pull out GBP or USD in the eurozone)
  • (enum set) card networks supported (visa, amex, discover, maestro, etc)
  • (enum set) UI languages supported
  • (integer) transaction limit for domestic cards
  • (integer) transaction limit for foreign cards
  • (integer set) denominations in the machine (Netherlands quietly removed all banknotes >€50 from all ATMs IIUC)
  • (boolean) whether customers can control the denominations
  • (boolean) indoor/outdoor (if the txn limit field is empty, indoor machines often have higher limits)
    • (string) hours of operation (if indoor)
    • (string) name of shop the ATM is inside (if indoor)
  • (enum) whether a balance check is supported: [no | only some cards | any card]; this feature is non-existent in Belgium but common in Netherlands. Note that some ATMs only give balance on their own cards.
    • (enum) whether the balance is on screen or printed to the receipt, or both
  • (boolean) insertion style -- whether the card is sucked into the machine (this is very important because if the card is sucked in by a motor there is a real risk that the machine keeps the card [yes, that’s deliberate]). Motorised insertion is more reliable but carries the risk of confiscation. Manual insertion can be fussy and take many tries to get it to read the card but you never have to worry about confiscation.
  • (boolean) dynamic currency conversion (DCC)
  • (boolean) whether there is an earphone port for blind people (not sure if that’s always there)
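
To show how the attribute list above hangs together, here is the same list written as a Python record. The field names are my own invention for illustration and are not existing OSM tags. `None` or an empty set means "not surveyed", and the values in the usage note are made up:

```python
from dataclasses import dataclass, field
from decimal import Decimal
from enum import Enum
from typing import Optional, Set

class BalanceCheck(Enum):
    NO = "no"
    SOME_CARDS = "only some cards"
    ANY_CARD = "any card"

class BalanceOutput(Enum):
    SCREEN = "screen"
    RECEIPT = "receipt"
    BOTH = "both"

@dataclass
class AtmRecord:
    operator_brand: str                              # the ATM operator, not the shop
    fee: Optional[Decimal] = None                    # fixed point ATM fee
    foreign_currencies: Set[str] = field(default_factory=set)
    card_networks: Set[str] = field(default_factory=set)
    ui_languages: Set[str] = field(default_factory=set)
    limit_domestic: Optional[int] = None
    limit_foreign: Optional[int] = None
    denominations: Set[int] = field(default_factory=set)
    customer_picks_denominations: Optional[bool] = None
    indoor: Optional[bool] = None
    opening_hours: Optional[str] = None              # if indoor
    shop_name: Optional[str] = None                  # if indoor
    balance_check: Optional[BalanceCheck] = None
    balance_output: Optional[BalanceOutput] = None
    motorised_insertion: Optional[bool] = None       # card is pulled in by a motor
    dcc: Optional[bool] = None                       # dynamic currency conversion
    earphone_port: Optional[bool] = None

def might_accept(atm: AtmRecord, network: str, amount: int) -> bool:
    """False only when the recorded data rules the withdrawal out for a
    foreign card; unknown (empty) fields never exclude a machine."""
    if atm.card_networks and network not in atm.card_networks:
        return False
    if atm.limit_foreign is not None and amount > atm.limit_foreign:
        return False
    return True
```

A traveler's app could then filter the map with something like `might_accept(atm, "visa", 150)` instead of sending people from machine to machine.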
16
 
 

In the Lemmy web client it used to be possible to open a new tab (control-tab) which would naturally be logged in. That goes for most websites. With Lemmy it started getting flakey (sometimes works, sometimes not). Lately it’s working less often and it seems browser flavor is a factor. Tor Browser (FF) generally works, but Ungoogled Chromium new tabs are logged out. So in UC, I have to do everything for a Lemmy instance under one tab.

I wonder what kind of funny business causes session cookies to fail. My guess is they are not using session cookies for logins but rather one of the rare alternatives.

update


With just one tab running, I did a hard refresh (control-shift-R). That logged me out presumably doing the same as getting a new tab. Using the /back/ button does not recover from this.

17
2
submitted 4 months ago* (last edited 4 months ago) by debanqued@beehaw.org to c/bugs@sopuli.xyz
 
 

I installed the Aria2 app from f-droid. I just want to take a list of URLs of files to download and feed it to something that does the work. That’s what Aria2c does on the PC. The phone app is a strange beast and it’s poorly described & documented. When I launch it, it requires creating a profile. This profile wants an address. It’s alienating as fuck. I have a long list of URLs to fetch, not just one. In digging around, I see sparse vague mention of an “Aria server”. I don’t have an aria server and don’t want one. Is the address it demands under the “connection” tab supposed to lead to a server?

The readme.md is useless:

https://github.com/devgianlu/Aria2App

The app points to this link which has no navigation chain:

https://github.com/devgianlu/Aria2App/wiki/Create-a-profile

Following the link at the bottom of the page superficially seems like it could have useful info:

“To understand how DirectDownload work and how to set it up go here.”

but clicking /here/ leads to a dead page. I believe the correct link is this one. But on that page, this so-called “direct download” is not direct in the slightest. It talks about setting up a server and running python scripts. WTF.. why do I need a server? I don’t want a server. I want a direct download in the true sense of the word direct.

18
 
 

If fedi node A and node B both have an anti-spam rule, it makes good sense that when a moderator removes a post for spam that it would be removed from both nodes. But what about other cases? Lemmy is a bit blunt and nuance-lacking in this regard.

For example, the parent of this thread was censored despite not breaking any rules. More importantly, it breaks no rules on slrpnk.net. Yet the slrpnk version was also removed.

I’m not sure exactly what the fix is. But in principle an author should be able to ask a slrpnk admin to restore the post in the slrpnk version of that community, so long as no slrpnk rules are broken by the post.

It’s one thing for various nodes to federate based on having compatible site-wide rules, but they aren’t necessarily aligned 100% and there are also rogue moderators who apply a different set of rules than what’s prescribed for a community.

19
 
 

If you long-tap an image that someone sent, options are:

  • share with…
  • copy original URL
  • delete image

The URL is not the local URL, it’s the network URL for fetching the image again. When you send outbound images, Snikket stores them in one place, but it’s nowhere near the place where it stores inbound images. I found it once after a lengthy hunt but did not take notes. I cannot find it now. I think it’s well buried somewhere. What a piece of shit.

20
 
 

Those who condemn centralised social media naturally block these nodes:

  • #LemmyWorld
  • #shItjustWorks
  • #LemmyCA
  • #programmingDev
  • #LemmyOne
  • #LemmEE
  • #LemmyZip

The global timeline is the landing page on Mbin nodes. It’s swamped with posts from communities hosted in the above shitty centralised nodes, which break interoperability for all demographics that Cloudflare Inc. marginalises.

Mbin gives a way for users to block specific magazines (Lemmy communities), but no way to block a whole node. So users face the very tedious task of blocking hundreds of magazines, which is effectively like a game of whack-a-mole. Whenever someone else on the Mbin node subscribes to a CF/centralised node, the global timeline gets polluted with exclusive content and potentially many other users have to find the block button.

Secondary problem: (unblocking)
My blocked list now contains hundreds of magazines spanning several pages. What if LemmEE decides one day to join the decentralised free world? I would likely want to stop blocking all communities on that node. But unblocking is also very tedious because you have to visit every blocked magazine and click “unblock”.

the fix


① Nix the global timeline. Lemmy also lacks whole-node blocking at the user level, but Lemmy avoids this problem by not even having a global timeline. Logged-in users see a timeline that’s populated only with communities they subscribe to.

«OR»

② Enable users to specify a list of nodes they want filtered out of their view of the global timeline.
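
Option ② is not much code. A sketch, assuming each post carries its magazine as a `name@host` string (that post shape and the function names are assumptions for illustration, not Mbin's data model):

```python
def node_of(magazine: str) -> str:
    """'technology@lemmy.world' -> 'lemmy.world'.
    A local magazine without '@' is returned unchanged."""
    return magazine.rsplit("@", 1)[-1].lower()

def filter_timeline(posts, blocked_nodes):
    """Drop every post whose magazine lives on a blocked node."""
    blocked = {node.lower() for node in blocked_nodes}
    return [p for p in posts if node_of(p["magazine"]) not in blocked]
```

Unblocking a node that later joins the free world is then a matter of deleting one entry from `blocked_nodes`, rather than visiting hundreds of magazines one by one.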

21
 
 

While composing this post the Lemmy web client went to lunch. This is the classic behaviour of Lemmy when it has a problem. No error, just infinite spinner. After experimentation, it turns out that it tries to be smart but fails when treating URLs written with the gemini:// scheme.

(edit) It’s probably trying to visit the link for that convenience feature of pre-filling the title. If it does not recognise the scheme, it should just accept it without trying to be fancy. It likely screws up on other schemes as well, like dict, ftp, news, etc.
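
The guard I have in mind is tiny. This is a sketch only. I am guessing at the cause, and `should_prefetch_title` and `suggest_title` are made-up names, not functions from the Lemmy code base:

```python
from urllib.parse import urlsplit

# Schemes the title pre-fill feature can reasonably fetch.
FETCHABLE_SCHEMES = {"http", "https"}

def should_prefetch_title(url: str) -> bool:
    """Only attempt the title lookup for schemes we know how to fetch."""
    return urlsplit(url).scheme.lower() in FETCHABLE_SCHEMES

def suggest_title(url: str, fetch_title):
    """Return a suggested title, or None when the URL should simply be
    accepted as-is (gemini, dict, ftp, news, ...). `fetch_title` is
    whatever callable performs the actual lookup."""
    if not should_prefetch_title(url):
        return None
    return fetch_title(url)
```

With that in place an unrecognised scheme skips the fancy step entirely instead of spinning forever.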

The workaround is to embed the #Gemini link in the body of the post.

22
 
 

I think the stock Lemmy client stops you from closing a browser tab if you have an editor open on a message, to protect you from accidental data loss.

Mbin does not.

23
 
 

A vast majority of the fediverse (particularly the threadiverse) is populated by people who have no sense of infosec or privacy, who run stock browsers over clearnet (e.g. #LemmyWorld users, the AOL users of today). They have a different reality than street wise people. They post a link to a page that renders fine in the world they see and they are totally oblivious to the fact that they are sending the rest of the fediverse into an exclusive walled garden.

There is no practical way for street wise audiences to signal “this article is exclusive/shitty/paywalled/etc”. Voting is too blunt of an instrument and does not convey the problem. Writing a comment “this article is unreachable/discriminatory because it is hosted in a shitty place” is high effort and overly verbose.

the fix


The status quo:

  • (👍/👎) ← no meaning.. different people vote on their own invented basis for voting

We need refined categorised voting. e.g.

  • linked content is interesting and civil (👍/👎)
  • body content is interesting and civil (👍/👎)
  • linked article is reachable & inclusive (👎)¹
  • linked article is garbage free (no ads, popups, CAPTCHA, cookie walls, etc) (👍/👎)

¹ Indeed a thumbs up is not useful on inclusiveness because we know every webpage is reachable to someone or some group and likely a majority. Only the count of people excluded is worth having because we would not want to convey the idea that a high number of people being able to reach a site in any way justifies marginalization of others. It should just be a raw count of people who are excluded. A server can work out from the other 3 voting categories the extent by which others can access a page.

From there, how the votes are used can evolve. A client can be configured to not show an egalitarian user exclusive articles. An author at least becomes aware that a site is not good from a digital rights standpoint, and can dig further if they want.
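
As a data structure, the categorised ballot could be as simple as the sketch below. The category names and the `max_excluded` client setting are assumptions of mine. The one rule taken from the proposal is that the reachability category accepts only a thumbs down:

```python
from collections import Counter

# Which directions each category accepts. "reachable" is down-only:
# it is a raw count of people who are excluded.
CATEGORIES = {
    "link_quality": {"up", "down"},
    "body_quality": {"up", "down"},
    "reachable": {"down"},
    "garbage_free": {"up", "down"},
}

class PostVotes:
    def __init__(self):
        self.tally = Counter()

    def cast(self, category: str, direction: str) -> None:
        if direction not in CATEGORIES.get(category, ()):
            raise ValueError(f"{direction!r} is not allowed on {category!r}")
        self.tally[(category, direction)] += 1

    def excluded_count(self) -> int:
        return self.tally[("reachable", "down")]

def visible_to(votes: PostVotes, max_excluded: int) -> bool:
    """Client-side filter: hide posts that exclude more people than the
    user is willing to tolerate."""
    return votes.excluded_count() <= max_excluded
```

An egalitarian user would set `max_excluded` to 0, so a single report of exclusion is enough to keep the article out of their view.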

update


The fix needs to expand. We need a mechanism for people to suggest alternative replacement links, and those links should also be voted on. When a replacement link is more favorable than the original link, it should float to the top and become the most likely link for people to visit.

24
 
 

Some will regard this as an enhancement request. To each his own, but IMO *grep has always had a huge deficiency when processing natural languages due to line breaks. PDFGREP especially because most PDF docs carry a payload of natural language.

If I need to search for “the.orange.menace” (dots are 1-char wildcards), of course I want to be told of cases like this:

A court whereby no one is above the law found the orange  
menace guilty on 34 counts of fraud..

When processing a natural language a sentence terminator is almost always a more sensible boundary. There’s probably no command older than grep that’s still in use today. So it’s bizarre that it has not evolved much. In the 90s there was a LexisNexis search tool which was far superior for natural language queries. E.g. (IIRC):

  • foo w/s bar :: matches if “foo” appears within the same sentence as “bar”
  • foo w/4 bar :: matches if “foo” appears within four words of “bar”
  • foo pre/5 bar :: matches if “foo” appears before “bar”, within five words
  • foo w/p bar :: matches if “foo” appears within the same paragraph as “bar”

Newlines as record separators are probably sensible for all things other than natural language. But for natural language grep is a hack.
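
To illustrate what a sentence-oriented grep would do, here is a naive Python sketch. `sentence_grep` is a made-up name, and the sentence splitter is deliberately crude (it breaks after ".", "!" or "?" followed by a space, so abbreviations will fool it):

```python
import re

def sentence_grep(text: str, pattern: str) -> list:
    """Return the sentences of `text` that match the regex `pattern`,
    treating line breaks as ordinary whitespace."""
    # Undo hard line wraps: any run of whitespace becomes one space.
    flat = re.sub(r"\s+", " ", text).strip()
    # Naive sentence split: break at a space that follows ., ! or ?
    sentences = re.split(r"(?<=[.!?]) ", flat)
    rx = re.compile(pattern)
    return [s for s in sentences if rx.search(s)]
```

Fed the two-line example above, the pattern `the.orange.menace` matches because the record is the sentence rather than the line. For PDFs the text would first have to be extracted (e.g. with pdftotext) and piped in.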

25
 
 

I cannot believe how stupid Chromium is considering it’s the king of browsers from a US tech giant. It’s another bug that should be embarrassing for Google.

If you visit a PDF, it fetches the PDF and launches pdf.js as expected. If you use the download button within pdf.js, you would expect it to simply copy the already fetched PDF from the cache to the download folder. But no.. the stupid thing goes out on the WAN and redownloads the whole document from the beginning.

I always suspected this, but it became obvious when I recently fetched a 20 MB PDF from a slow server. It struggled for a while to get the whole thing just for viewing. Then after clicking to download within pdf.js, it was crawling again from 1% progress.

What a stupid waste of bandwidth, energy and time.
