Topic: Collection Regex fixes and possible correction steps

Bart39
« on: 2015-05-08, 06:32:35 AM »
We recently discovered some issues introduced when we converted the collection regexes from file-based to database storage. In certain groups these would have caused releases to be split, or regexes from the wrong group to be used, creating malformed releases.
These have been fixed in the Dev branch as of the 8th of May, but you may want to consider the following steps if you want to correct the affected groups (if you are on master, wait until the fix goes into master):

EDIT: smaller list of groups actually affected

Affected groups are:
Code: [Select]
alt.binaries.bloaf
alt.binaries.blu-ray
alt.binaries.ebook
alt.binaries.e-book
alt.binaries.e-book.flood
alt.binaries.e-book.magazines
alt.binaries.e-book.rpg
alt.binaries.highspeed
alt.binaries.inner-sanctum
alt.binaries.tvseries
alt.binaries.u-4all

If you don't particularly care about the releases already created, then you should just reset the groups:
Code: [Select]
# drop and re-create the collections / binaries / parts table(s)
php misc/testing/DB/reset_truncate.php drop

# or: truncate the collections / binaries / parts table(s)
php misc/testing/DB/reset_truncate.php true

If you want to go back and re-download / re-create releases for the affected groups, then do the following in addition to the above. WARNING: this will delete releases back to a certain date for those groups, so you will need to backfill them.
The following deletes releases for the affected groups back to the 9th of March (which is how far back Dev was affected):

Code: [Select]
# 1425859200 is 2015-03-09 00:00:00 UTC; the php -r call yields the hours elapsed since then
for group in bloaf blu-ray ebook e-book e-book.flood e-book.magazines e-book.rpg highspeed inner-sanctum tvseries u-4all
do
    php misc/testing/Release/delete_releases.php groupname=equals="alt.binaries.$group" adddate=smaller="$(php -r 'echo round(((time() - 1425859200)/60)/60);')"
done
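
For anyone wondering where the number in the loop comes from: 1425859200 is a Unix timestamp, and the `php -r` call turns it into the number of hours between that moment and now, which is the value handed to `adddate=smaller=`. Below is a minimal shell sketch of the same arithmetic, purely as an illustration. The `CUTOFF` variable and the `hours_since` function are made-up names and not part of nZEDb, and the sketch assumes the current time is after the cutoff.

```shell
#!/bin/sh
# Cutoff used in the loop above, as a Unix timestamp (seconds since 1970-01-01 UTC).
CUTOFF=1425859200

# hours_since NOW_EPOCH
# Hours between the cutoff and NOW_EPOCH, rounded to the nearest hour
# (adding 1800 s before the integer division mimics PHP's round()).
hours_since() {
    echo $(( ($1 - CUTOFF + 1800) / 3600 ))
}

# Show the cutoff as a calendar date: prints 2015-03-09 00:00:00 UTC
# (GNU date first, BSD/macOS date as a fallback).
date -u -d "@$CUTOFF" '+%Y-%m-%d %H:%M:%S UTC' 2>/dev/null \
    || date -u -r "$CUTOFF" '+%Y-%m-%d %H:%M:%S UTC'

# The value the loop would currently pass as adddate=smaller=...
hours_since "$(date +%s)"
```

Checking the printed value by eye before starting the loop is a cheap sanity check, since the deleted releases can only be brought back by backfilling.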
« Last Edit: 2015-05-08, 07:16:35 AM by Bart39 »

hanshansen
« Reply #1 on: 2015-05-17, 10:50:08 AM »
Quote from: Bart39
We recently discovered some issues introduced when we converted the collection regexes from file-based to database storage. In certain groups these would have caused releases to be split, or regexes from the wrong group to be used, creating malformed releases.
These have been fixed in the Dev branch as of the 8th of May, but you may want to consider the following steps if you want to correct the affected groups (if you are on master, wait until the fix goes into master):

What are the symptoms of this problem? I'm seeing a lot of duplicate releases in inner-sanctum, where a release is created for each song. Is this what you were talking about?

Is this fix already in master? Or do I need to switch to the dev branch (and how)? I have hundreds of thousands, if not millions, of releases in the queue after I backfilled inner-sanctum a couple of weeks ago.

hanshansen
« Reply #2 on: 2015-05-17, 01:04:38 PM »

Quote from: hanshansen
Or do I need to switch to the dev branch (and how)?

Ok, should have read the FAQ or googled it. But there is no branch "dev", only "remotes/origin/dev" amongst others.
Code: [Select]
nzedb@server:/var/www/nZEDb$ git branch -a
* master
  remotes/origin/HEAD -> origin/master
  remotes/origin/dev
  remotes/origin/dev-db
  remotes/origin/dev-ftsearch
  remotes/origin/dev-li3
  remotes/origin/dev-logging
  remotes/origin/dev-misc
  remotes/origin/dev-openssl
  remotes/origin/dev-pp
  remotes/origin/dev-relsplit
  remotes/origin/dev-reqidtable
  remotes/origin/dev-scrutinizer
  remotes/origin/dev-sphinx
  remotes/origin/dev-sphinxindexes
  remotes/origin/dev-tasky-fixes
  remotes/origin/dev-test
  remotes/origin/dev-tmuxrevamp
  remotes/origin/dev-utility
  remotes/origin/dev-xdnzb
  remotes/origin/master
  remotes/origin/master-debug
  remotes/origin/next-master

When I do "git checkout dev" and "git pull" I get the message "Already up-to-date" and I seem to have created a local dev branch.

What am I doing wrong?

Edit: I think I figured it out, and it shows that I'm a "git noob". When I "git pull" I actually pull all branches, and then I can switch between them with "git checkout <branch>", right? I noticed that the Changelog changed when I was playing around with the various git commands...
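
A hedged note on what git is doing here: "git pull" fetches updates for all remote-tracking branches (the "remotes/origin/..." entries) but only merges into the branch that is currently checked out, and "git checkout dev" creates a local "dev" that tracks "origin/dev" when no local branch of that name exists yet. That would explain both the new local branch and the "Already up-to-date" message. The sketch below reproduces this in a throwaway repository. The temporary paths, the `g` helper and the commit messages are invented for the demo and have nothing to do with the nZEDb repository itself.

```shell
#!/bin/sh
# Throwaway demo; nothing here touches a real nZEDb checkout.
tmp=$(mktemp -d)
# Helper so commits work even without a configured git identity.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

# "Upstream" repository with two branches, master and dev.
g init -q "$tmp/upstream"
cd "$tmp/upstream" || exit 1
g symbolic-ref HEAD refs/heads/master      # make sure the first branch is called master
g commit -q --allow-empty -m "initial commit"
g checkout -q -b dev
g commit -q --allow-empty -m "fix that only exists on dev"
g checkout -q master

# A fresh clone only has a local master; dev exists as remotes/origin/dev.
g clone -q "$tmp/upstream" "$tmp/clone"
cd "$tmp/clone" || exit 1
g branch -a

# No local dev yet, so git creates one that tracks origin/dev.
g checkout -q dev
g rev-parse --abbrev-ref HEAD              # dev
g rev-parse --abbrev-ref 'dev@{upstream}'  # origin/dev

# Nothing new upstream, so this reports that the branch is already up to date.
g pull
```

After the checkout, the local dev branch already points at the same commit as origin/dev, which is why the pull has nothing left to do.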
« Last Edit: 2015-05-17, 02:20:48 PM by hanshansen »

seb
« Reply #3 on: 2015-05-18, 08:13:40 AM »
Quote from: hanshansen
What are the symptoms of this problem? I'm seeing a lot of duplicate releases in inner-sanctum, where a release is created for each song. Is this what you were talking about?

Is this fix already in master? Or do I need to switch to the dev branch (and how)? I have hundreds of thousands, if not millions, of releases in the queue after I backfilled inner-sanctum a couple of weeks ago.

Yes, I think that's what Bart39 is telling us here. So we will have to wait until the first Monday of June, which would actually be the 1st of June, to get this into the "master" branch, right?

hanshansen
« Reply #4 on: 2015-05-18, 03:05:46 PM »
Are you sure it happens only in the groups you mentioned and it is solved in dev? I switched to the dev branch yesterday and I still see this behaviour in a.b.mom right now.

Example: (The Ventures Live in Japan 1965) [29/35] - "29 Journey To The Stars.mp3" yEnc (1/13)
Posted: 2015-05-17 11:52:29

There is a release created for each file.

Edit: Just to add a small detail: the releases created were already in the queue; I hadn't fetched headers yet after switching to dev. I want the queue empty before I fetch new headers.
« Last Edit: 2015-05-18, 03:21:48 PM by hanshansen »

Wally73
« Reply #5 on: 2015-05-18, 11:52:12 PM »
When you switched to the dev branch, did you run the following?

Code: [Select]
php /var/www/nZEDb/cli/update_db.php true

hanshansen
« Reply #6 on: 2015-05-19, 01:01:03 AM »
Quote from: Wally73
When you switched to the dev branch, did you run the following?

Code: [Select]
php /var/www/nZEDb/cli/update_db.php true

yes

Code: [Select]
nzedb@server:~$ php /var/www/nZEDb/cli/update_db.php true
Db updater starting ...
Looking for unprocessed patches...
Info: Nothing to patch, you are already on version 341

OK, the release was created before I switched to dev. At the moment I'm only doing postprocessing. I only wanted to point out that it happened in a.b.mom too, though it was the only example I've seen so far. Is it possible that that example was a crosspost to one of the affected groups?
« Last Edit: 2015-05-19, 03:34:32 AM by hanshansen »

Bart39
« Reply #7 on: 2015-05-19, 11:58:36 AM »
Bear in mind this is not going to "fix" every split release; we just discovered a problem with some of the regexes for the groups I mentioned and fixed it.