So, I've got Bugzilla. Bugzilla takes commits from CVS and adds them as comments to bugs. Then, when displaying bugs, it parses those comments and turns them into links to ViewCVS, with diffs when applicable, in an HTML table.
That table should have:
PATH/FILENAME LAST_VERSION THIS_VERSION DIFF
Each of those should be a link to the ViewCVS.cgi file, with the appropriate bits stacked on the end to get you a view of the file as a whole, the version you replaced, the version you committed, and the diff between the two. It handles new files and deleted files just fine.
The problem is that when the path/filename is very long, it causes the parser to produce garbage.
Given this text: \ncompany/docs/ProductDocs/releaseNote.html 1.9 1.10\n
(all one line, pulled right out of a MediumText entry in a MySQL table)
the code correctly parses it into
company/docs/ProductDocs/releaseNote.html 1.9 1.10 View diff
However, given *this* text: \ncompany/runtime/file_struct_templates/company/profiles/TI/languages/c/framework/target/oe_header.txt 1.21 1.22\n
(again all one line, pulled right out of a MediumText entry in a MySQL table)
the code garbles it into
View diff company/runtime/templates/company/output/port/c/zceComponentContainerImplPortsInclude.xslt View diff 1.11 1.12 View diff
The code of the function itself that does the parsing is here. And no, I didn't write it. If I'd written it, there would be more comments.
You can use View Source to see the tables themselves, if you want.
But, here's the question. Why does it garble the parsing when the path/filename is long? And how do I fix that script to make it parse the long ones correctly?
EDIT: There should be newlines before and after the text I'm passing to the function. Sorry.
EDIT#2: Corrected the actual text that's sent to the function.
EDIT#3: Apparently the problem with this code is that there is no problem with this code - it works perfectly for the given inputs, but the inputs provided on the long pathnames are full of extra \ns for no apparent reason. Lovely.
(no subject)
Date: 2008-01-16 06:10 pm (UTC)
The stars are right.
Ia! Ia! Cthulhu ftagn!
(no subject)
Date: 2008-01-16 07:09 pm (UTC)
Because the first thing the Perl script does is split on \n (newlines).
That puts the string into an array, with each line of text as a separate element (so here, only one line).
Then it "shifts" the array, which removes the first element and leaves nothing inside it.
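A minimal sketch of that split-then-shift step, written in Python rather than the script's own Perl. The helper name `split_and_shift` is made up for illustration, and the trailing-blank cleanup is there only to mimic Perl's `split`, which drops trailing empty fields. It shows why the result depends on whether the text is wrapped in newlines:

```python
def split_and_shift(text):
    """Split on newlines, then drop the first element (like Perl's shift)."""
    lines = text.split("\n")
    # Perl's split discards trailing empty fields; mimic that here.
    while lines and lines[-1] == "":
        lines.pop()
    if lines:
        lines.pop(0)  # the "shift": remove the first line
    return lines


no_newlines = "company/docs/ProductDocs/releaseNote.html 1.9 1.10"
wrapped = "\n" + no_newlines + "\n"

# Without a leading newline, the only line is shifted away.
print(split_and_shift(no_newlines))
# With a leading newline, the blank first line is shifted away instead.
print(split_and_shift(wrapped))
```

So with no leading newline nothing is left to parse, while the newline-wrapped text keeps its one record.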
(no subject)
Date: 2008-01-16 07:12 pm (UTC)
Because I can't get the short version to work in the code as presented.
(no subject)
Date: 2008-01-16 07:14 pm (UTC)
so "\nSubject: CVS Checkin BRANCH: HEAD FILES CHECKED IN: company/runtime/file_struct_templates/company/profiles/TI/languages/c/framework/target/oe_header.txt 1.21 1.22\n" works.
(no subject)
Date: 2008-01-16 07:15 pm (UTC)
(no subject)
Date: 2008-01-16 07:16 pm (UTC)
(no subject)
Date: 2008-01-16 07:25 pm (UTC)
"Subject: CVS Checkin BRANCH: HEAD FILES CHECKED IN:\ncompany/docs/ProductDocs/releaseNote.html 1.9 1.10\n"
?
Because that works.
(no subject)
Date: 2008-01-16 07:33 pm (UTC)
But I'm not sure.
It's possible it's only passing the "\ncompany/docs/ProductDocs/releaseNote.html 1.9 1.10\n" part.
(no subject)
Date: 2008-01-16 08:50 pm (UTC)
(no subject)
Date: 2008-01-16 08:57 pm (UTC)
(no subject)
Date: 2008-01-16 09:24 pm (UTC)
Once it's wrapped in newlines, what the subroutine sees is basically:
A blank line (thus the first blank line in the output)
A line with nothing but the relative path (which is parsed as being the new version number, due to a quirk in the code)
A line with nothing but the old number and the new number (which are parsed as the relative path of a new entry, and the old number of that entry, as fallout from the same code quirk)
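To make that line-by-line view concrete, here is a small Python sketch (the script itself is Perl, and `describe_lines` is a made-up helper). It assumes the stray newline sits between the path and the version numbers, which matches the description above but isn't confirmed from the actual database contents:

```python
def describe_lines(text):
    """Return (line, number of whitespace-separated tokens) for each line."""
    lines = text.split("\n")
    # Drop trailing blank lines, as Perl's split would.
    while lines and lines[-1] == "":
        lines.pop()
    return [(line, len(line.split())) for line in lines]


path = ("company/runtime/file_struct_templates/company/profiles/TI/"
        "languages/c/framework/target/oe_header.txt")

# Assumed shape of the bad input: a newline where a space should be.
broken = "\n" + path + "\n1.21 1.22\n"

for line, count in describe_lines(broken):
    print(count, repr(line))
```

No line ends up with the three tokens a record needs (the counts are 0, 1, and 2), so any code that expects path, old version, and new version on one line will put fields in the wrong columns.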
(no subject)
Date: 2008-01-17 12:17 am (UTC)
(no subject)
Date: 2008-01-17 12:28 am (UTC)
(no subject)
Date: 2008-01-17 02:27 am (UTC)
So.
Items come in groups of three.
The first one will ALWAYS be "company/something"
The second one will ALWAYS be a number, or the phrase "NONE"
The third one will ALWAYS be a number, or the phrase "NONE"
That should be REALLY easy to parse on, and then I could just discard all newlines in text.
And I don't really want to filter based on "company/" because I'm trying to convince them to split the damn module, which would mean that that first word might change up some. And the word appears IN the rest of the path, sometimes. But it will ALWAYS be [thing]/[more things that might or might not have slashies].
Of course, those "more things" might include spaces and might include numbers.
So that makes it harder to parse based on numbers.
Perhaps that's why he reversed it? Reverse, parse for number/NONE, parse for number/NONE, parse for filename now that you're sure the version numbers aren't in it? But then you have to know when to stop - and, when it's reversed, I suppose you could always stop at "[space or newline]company/" - since that's an extremely odd combination to have in the middle of a path. Not totally impossible, though.
Grr!
I wonder if I can just find the bit that's "parsing" the long path names and make it stop sticking newlines in.
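The groups-of-three idea above could be sketched roughly like this, in Python rather than the script's Perl. Everything named here (`VERSION`, `RECORD`, `parse_records`) is hypothetical, and it assumes version numbers are always dotted (like 1.9) or the word NONE. It throws away all newlines first, then looks for "anything, then two version-ish tokens":

```python
import re

# Assumption: a version is either NONE or a dotted number such as 1.9 or 1.2.4.1.
VERSION = r"(?:NONE|\d+(?:\.\d+)+)"
# A record is a path (as short as possible) followed by two version tokens.
RECORD = re.compile(
    r"(\S.*?)\s+(" + VERSION + r")\s+(" + VERSION + r")(?=\s|$)"
)


def parse_records(text):
    """Collapse all whitespace (including newlines), then pull out triples."""
    flat = " ".join(text.split())
    return [m.groups() for m in RECORD.finditer(flat)]


sample = (
    "\ncompany/runtime/file_struct_templates/company/profiles/TI/"
    "languages/c/framework/target/oe_header.txt\n1.21 1.22\n"
    "company/docs/ProductDocs/releaseNote.html 1.9 1.10\n"
)
for record in parse_records(sample):
    print(record)
```

This doesn't care what the first path component is, and it copes with spaces in the path. It would still get confused by a path that itself contains two dotted numbers separated by spaces, so it's a starting point, not a guarantee.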
(no subject)
Date: 2008-01-17 01:02 pm (UTC)
He splits from the end, telling Perl he wants a maximum of three tokens when he's done (two splits). So even if Perl finds more spaces, it won't split on them. This means that if your pathname contains a space (like Windows pathnames are wont to do), it won't oversplit and give you a path like:
company\my
documents\my
projects\somewebspace\some
more paths\Frank!
10.2
10.3
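For comparison, the same split-from-the-end trick as a Python sketch. The original does it in Perl; `split_from_end` is a made-up name, and `str.rsplit` with a limit of 2 stands in for whatever the script actually does:

```python
def split_from_end(line):
    """Split a record into (path, old_version, new_version) from the right."""
    # At most two splits, counted from the end, so spaces inside the path
    # are left alone.
    parts = line.strip().rsplit(None, 2)
    if len(parts) != 3:
        raise ValueError("expected path, old version, new version: %r" % line)
    return tuple(parts)


line = r"company\my documents\my projects\somewebspace\some more paths\Frank! 10.2 10.3"
path, old, new = split_from_end(line)
print(path)
print(old, new)
```

The path comes back in one piece, spaces and all, and only the last two tokens are treated as version numbers. A line that has been broken up by stray newlines won't have three tokens, so it raises instead of quietly producing garbage.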