ColinHayhurst

joined 1 month ago
[–] ColinHayhurst@lemmy.world 2 points 3 days ago

Excellent reporting on the trials: https://www.bigtechontrial.com/

[–] ColinHayhurst@lemmy.world 2 points 1 month ago (1 children)

Where is your evidence for that? It used to be Bing and Yandex, but now it's just Bing. They use other non-search-engine APIs and do a small amount of crawling, AFAIK. Details of who uses what here: https://seirdy.one/posts/2021/03/10/search-engines-with-own-indexes/

[–] ColinHayhurst@lemmy.world 29 points 1 month ago* (last edited 4 weeks ago) (3 children)

You should put these entries into your robots.txt file.

To block the Google search crawler for your whole site, use:

User-agent: Googlebot
Disallow: /

To block the Google AI crawler, use:

User-agent: Google-Extended
Disallow: /
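If you want to double-check that your rules do what you expect, here's a small sketch using Python's standard-library robots.txt parser (the robots.txt content is inlined for illustration; paths and bot names other than Googlebot/Google-Extended are made up):

```python
# Sketch: checking the robots.txt rules above with the stdlib parser.
import urllib.robotparser

robots_txt = """\
User-agent: Googlebot
Disallow: /

User-agent: Google-Extended
Disallow: /
"""

parser = urllib.robotparser.RobotFileParser()
parser.parse(robots_txt.splitlines())

# Both Google crawlers are blocked from every path...
print(parser.can_fetch("Googlebot", "/any/page"))        # False
print(parser.can_fetch("Google-Extended", "/any/page"))  # False
# ...while crawlers with no matching rule (and no "*" group) are allowed.
print(parser.can_fetch("SomeOtherBot", "/any/page"))     # True
```

Note that well-behaved crawlers obey robots.txt voluntarily; it is a request, not an enforcement mechanism.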

[–] ColinHayhurst@lemmy.world 6 points 1 month ago

Yes, it was. Matt Wells closed it down just over one year ago.

[–] ColinHayhurst@lemmy.world 5 points 1 month ago (1 children)

yep, in footer "© 2024 Infospace Holdings LLC, A System1 Company"

[–] ColinHayhurst@lemmy.world 4 points 1 month ago (3 children)

System1 (https://system1.com/) is an adtech company syndicating Bing and/or Google results.

[–] ColinHayhurst@lemmy.world 8 points 1 month ago* (last edited 1 month ago) (3 children)

We'd love to build a distributed search engine, but I think it would be too slow. When you send us a query we search 8 billion+ pages and bring back the top 10, 20, ... up to 1,000 results. For a good service we need to do that in about 200ms, and that is why the index needs to be centralised. It took years, several iterations, and carefully designed algos & architecture to make something this fast. No doubt Google, Bing, Yandex & Baidu went through similar hoops. Maybe I'm wrong, and/or someone can make it work with our API.
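A back-of-envelope model of why the 200ms budget pushes you toward a centralised index. All numbers below are illustrative assumptions, not measurements from Mojeek or anyone else; the point is only that WAN round trips and straggler nodes eat the budget:

```python
# Rough latency model for one query (all numbers are made-up assumptions).
def query_latency_ms(network_rtt_ms, shard_search_ms, merge_ms, rounds=1):
    """Network round trips to the index shards, plus the slowest shard's
    local search time, plus merging the partial result lists."""
    return rounds * network_rtt_ms + shard_search_ms + merge_ms

# Centralised index: shards in one datacenter, ~1 ms RTT between machines.
central = query_latency_ms(network_rtt_ms=1, shard_search_ms=120, merge_ms=20)

# Distributed index: shards on nodes spread across the internet, where a
# single WAN round trip can cost 100+ ms and the slowest node dominates.
distributed = query_latency_ms(network_rtt_ms=150, shard_search_ms=400,
                               merge_ms=20)

budget_ms = 200
print(central, distributed)      # 141 570
print(central <= budget_ms)      # True
print(distributed <= budget_ms)  # False
```

Under these assumed numbers the datacenter-local setup fits the budget with room to spare, while the wide-area version blows it on the first round trip alone.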