Author Topic: Handling server download failures  (Read 3064 times)

Offline joecool

  • Newbie
  • Posts: 2
  • Helpful: +0/-0
Handling server download failures
« on: 2016-02-04, 12:30:43 PM »
There seems to be a problem in the way articles are being downloaded when the server responds with a broken result.  Here's what I see:

New group alt.binaries.hdtv starting with 100,000 messages worth. Leaving 20,000 for next pass.
Server oldest: ...

Getting 20,000 articles ...
Received 20000 articles of 20,000 requested ...
32.15s to download articles, ...

Getting 20,000 articles ...
Getting 20,000 articles ...
Getting 20,000 articles ...
Getting 20,000 articles ...
0.82s to download articles, ...

Group alt.binaries.hdtv processed in ...

Note that the first 20k records came down fine, but the 2nd/3rd/4th sets didn't.  Then the 5th set comes in.  At the end of this, I have missed 60k parts in the middle of a set of 100k.  They are not marked as missed and they won't be retried - they're just gone.

I tried to track down what's going on.  What I find is that it runs this in Binaries.php starting at line 420:
Code: [Select]
// Get article headers from newsgroup.
$scanSummary = $this->scan($groupMySQL, $first, $last);

// Check if we fetched headers.
if (!empty($scanSummary)) {
    ... do stuff here ...
    $this->_pdo->queryExec(
        sprintf('
            UPDATE groups
            SET last_record = %s, last_record_postdate = %s, last_updated = NOW()
            WHERE id = %d',
            $this->_pdo->escapeString($scanSummary['lastArticleNumber']),
            $this->_pdo->from_unixtime($scanSummary['lastArticleDate']),
            $groupMySQL['id']
        )
    );
} else {
    // If we didn't fetch headers, update the record anyway.
    $this->_pdo->queryExec(
        sprintf('
            UPDATE groups
            SET last_record = %s, last_updated = NOW()
            WHERE id = %d',
            $this->_pdo->escapeString($last),
            $groupMySQL['id']
        )
    );
}
This says that if the download of headers fails and we get nothing back from the server, the group's last_record is advanced anyway, as if the fetch had succeeded.  That seems wrong.

Digging in further, it looks like scan() might be hitting this on line 585:
Code: [Select]
if ($msgCount < 1) {
    return $returnArray;
}
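
That early return means a server failure and a genuinely empty range both come back to the caller as the same empty result, so it can't tell them apart.  A minimal sketch of one way to separate the two cases - purely illustrative, not nZEDb's actual code; fetchOverview() is a made-up stand-in for the real NNTP overview call:

```php
<?php
// Sketch only: keep "server failed" distinct from "range was empty".
// fetchOverview() is a hypothetical placeholder for the real NNTP call.
function scanSketch(callable $fetchOverview, int $first, int $last)
{
    $headers = $fetchOverview($first, $last);
    if ($headers === false) {
        // Transport/protocol failure: caller must NOT advance last_record.
        return false;
    }
    if (count($headers) < 1) {
        // Genuinely empty range: safe to skip past it.
        return [];
    }
    return ['lastArticleNumber' => $last, 'msgCount' => count($headers)];
}
```

With that, the caller's `!empty($scanSummary)` check could become a strict `=== false` test, and only a real failure would skip the last_record update.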

What I've done to "fix" this locally is:
Code: [Select]
if (!empty($scanSummary)) {
    ... do stuff here ...
    $this->_pdo->queryExec(
        sprintf('
            UPDATE groups
            SET last_record = %s, last_record_postdate = %s, last_updated = NOW()
            WHERE id = %d',
            $this->_pdo->escapeString($scanSummary['lastArticleNumber']),
            $this->_pdo->from_unixtime($scanSummary['lastArticleDate']),
            $groupMySQL['id']
        )
    );
} else {
    $done = true;  // <-- my change
}

This makes it fail hard if the server completely barfs on one set of messages.

New group alt.binaries.hdtv starting with 100,000 messages worth. Leaving 20,000 for next pass.
Server oldest: ...

Getting 20,000 articles ...
Received 20000 articles of 20,000 requested ...
32.15s to download articles, ...

Getting 20,000 articles ...
<barf>
Group alt.binaries.hdtv processed in ...

The thing here is that it will not mark that 2nd set of headers as downloaded if they actually fail.  It further will not try the 3rd/4th/5th sets of headers until the next run - this keeps everything in order correctly.
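
For what it's worth, the $done flag only works because the group is processed in 20k chunks by an outer loop.  A rough sketch of that shape - variable and function names here are guesses for illustration, not the actual Binaries.php code:

```php
<?php
// Rough sketch of the chunked outer loop that $done short-circuits.
// $scan is a placeholder for the real per-chunk header fetch.
function processGroupSketch(callable $scan, int $first, int $last, int $chunk = 20000): int
{
    $done = false;
    while (!$done && $first <= $last) {
        $chunkLast = min($first + $chunk - 1, $last);
        $summary = $scan($first, $chunkLast);
        if (!empty($summary)) {
            $first = $chunkLast + 1;   // advance only on a successful fetch
        } else {
            $done = true;              // bail out; next run resumes at $first
        }
    }
    return $first;                     // where the next pass should start
}
```

Because $first only moves forward on success, a mid-group failure leaves the resume point exactly at the first chunk that was never confirmed.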

FWIW, Newznab "handles" this by throwing "error 1000, empty gzip stream" or something of the sort.  Their result is the same as what I did above, in that it immediately fails instead of continuing.
« Last Edit: 2016-02-04, 02:26:15 PM by Wally73 »

Offline Wally73

  • Overlord
  • Posts: 260
  • Helpful: +28/-1
  • i'm nuts
Re: Handling server download failures
« Reply #1 on: 2016-02-04, 02:24:39 PM »
create a pull request on github

ps. i've added code tags around the code, makes it easier ;)
« Last Edit: 2016-02-04, 02:26:39 PM by Wally73 »

Offline joecool

  • Newbie
  • Posts: 2
  • Helpful: +0/-0
Re: Handling server download failures
« Reply #2 on: 2016-02-04, 02:29:13 PM »
I didn't try to change the code myself just because I'm so new to this codebase.

Specifically, should there be some retries instead of just dying?  There's retry stuff in the site config, but there is no indication that such a failure will kick it off.
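
Something like a bounded retry around scan() is what I had in mind - again purely a sketch with made-up names, not a claim about how the existing retry settings actually work:

```php
<?php
// Sketch: retry a failed header fetch a few times before giving up,
// instead of immediately ending the pass. All names are illustrative.
function scanWithRetries(callable $scan, int $first, int $last, int $maxRetries = 3)
{
    for ($attempt = 1; $attempt <= $maxRetries; $attempt++) {
        $summary = $scan($first, $last);
        if (!empty($summary)) {
            return $summary;   // success: caller may advance last_record
        }
        // A real implementation would back off (sleep) between attempts.
    }
    return false;              // every attempt failed: stop and keep ordering
}
```

Only if every attempt fails would the group pass stop, which keeps the same ordering guarantee as the $done change above while riding out transient server hiccups.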

"Upgrading" nZEDb code to match what Newznab does doesn't seem like the goal  :)