Serge Bezborodov
Logs Geek, CTO JetOctopus - crawler & log analyzer, love Big Data, share data insights.
1,676 Tweets | 182 Following | 677 Followers
Tweets
Serge Bezborodov Nov 17
Replying to @matteogiannone @JohnMu
We definitely need to test it
Serge Bezborodov Nov 17
Replying to @JohnMu
The problem is that someone can use it to scrape content, and there seems to be no way to block it.
Serge Bezborodov Nov 16
We saw the ads bot have an impact on the search crawling bot. With huge campaigns, it's easy to overload a website without reducing search bot crawling speed. Btw, the search bot can slow down when it sees page load time increasing.
Serge Bezborodov Nov 16
Need to scrape a website behind a strong anti-scraping system? Just use Google's Mobile-Friendly Test tool. It uses the same IPs and user-agent as Googlebot. Is there any way to block the Mobile-Friendly Test tool?
Serge Bezborodov Nov 12
Replying to @Suganthanmn @nickswan
Nice job, but why only 25k? You can query the API more times.
Serge Bezborodov Oct 29
Replying to @sashaborm
Yes, over the last 6 months it has grown several times over, but the volumes are tiny compared to Google.
Serge Bezborodov Oct 28
Replying to @Suganthanmn
Apple's documentation is slightly wrong about the desktop user-agent. The real desktop UA is 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Safari/605.1.15 (Applebot/0.1; +)'. We never saw the desktop UA given in the docs.
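To make the user-agent point concrete, here is a minimal sketch of spotting that Applebot token in a log line. The function names are hypothetical, and treating "Macintosh" as the desktop marker is an assumption based only on the UA quoted above. A user-agent string is trivially spoofed, so this only tells you what a client claims to be.

```python
import re

# Matches the Applebot product token and captures its version, e.g. "Applebot/0.1".
APPLEBOT_TOKEN = re.compile(r"Applebot/(\d+(?:\.\d+)*)")


def applebot_version(user_agent: str):
    """Return the claimed Applebot version, or None if the token is absent."""
    match = APPLEBOT_TOKEN.search(user_agent)
    return match.group(1) if match else None


def looks_like_desktop_applebot(user_agent: str) -> bool:
    # Assumption: "Macintosh" marks the desktop variant, as in the quoted UA.
    return applebot_version(user_agent) is not None and "Macintosh" in user_agent
```

Matching on the token rather than the full string keeps the check working when the WebKit or Safari version numbers change.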
Serge Bezborodov Sep 21
Replying to @DimoBelov
Yes, headless chrome
Serge Bezborodov Sep 18
Replying to @Suganthanmn
Oh, the 21st century!
Serge Bezborodov Sep 18
We're testing JS crawling on an enterprise client's websites. The server has 32 processor cores and 94 GB RAM, and its load is about 80-90% with just 20 crawler threads! I think JS websites and bitcoin mining have added a few percent to global warming.
Serge Bezborodov Sep 18
Replying to @Suganthanmn
Can't wait! Most HTTPS websites already work on HTTP/2. I'm curious what the impact on performance will be.
Serge Bezborodov Sep 17
Replying to @alexburaks
We got it here back in the day
Serge Bezborodov Sep 17
Replying to @alexburaks
But the Alexa top 1M hasn't been updated for a long time
Serge Bezborodov Sep 16
Replying to @Nubrik
I'll be broadcasting online
Serge Bezborodov Sep 14
Replying to @Suganthanmn
If you want to solve a problem with regex - now you have two problems )
Serge Bezborodov Sep 14
Yes, but a lot of websites send automatic abuse reports for fake Googlebots.
Serge Bezborodov Sep 14
Replying to @jackiecchu
I don't get how they deal with abuse reports. Some years ago we tried to crawl a few million websites with the Googlebot user agent and received tons of abuse complaints.
Serge Bezborodov Sep 14
Replying to @fabioricotta
The problem is that Facebook crawls other websites as "googlebot". We're going to calculate their total number of requests.
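Separating real Googlebot hits from impostors in logs is usually done with the reverse-then-forward DNS check that Google documents. The sketch below is one way to write it, not the method used in this thread. The function names and the injectable `reverse`/`forward` resolvers are hypothetical and exist only so the logic can be tried without network access.

```python
import socket

# Hostname suffixes used by genuine Google crawlers.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_lookup(ip):
    # PTR lookup: returns the primary hostname for the IP.
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname):
    # Forward lookup: returns the list of IPv4 addresses for the hostname.
    return socket.gethostbyname_ex(hostname)[2]


def is_real_googlebot(ip, reverse=reverse_lookup, forward=forward_lookup):
    """True only if the IP's PTR name is a Google host that resolves back to the IP."""
    try:
        hostname = reverse(ip)
        if not hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
            return False
        # The forward step defeats spoofed PTR records on attacker-owned IPs.
        return ip in forward(hostname)
    except OSError:
        # socket.herror and socket.gaierror are OSError subclasses.
        return False
```

Calling `is_real_googlebot(ip)` with the defaults performs live DNS queries, so on a large log it is worth caching results per IP. The default forward resolver covers IPv4 only.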
Serge Bezborodov Sep 14
Replying to @Suganthanmn
Ha-ha!