
[#768] scmhook/committracker with scmsvn has performance issue on tagging

2015-04-28 14:57
Submitted by: Bernd Laengerich (progbernie)
Assigned to: Nobody (None)
Target Release:
Found in Version:

Detailed description
Tag a module in svn.
The committracker hook then runs after the commit, using the file plugins/scmhook/library/scmsvn/hooks/committracker/post.php:

/usr/bin/php -d include_path=/usr/share/php5/PEAR:/etc/gforge//custom:/etc/gforge/:/usr/share/fusionforge/common:/usr/share/fusionforge/www:/usr/share/fusionforge/plugins:/usr/share/fusionforge:/usr/share/fusionforge/www/include:/usr/share/fusionforge/common/include:.:/usr/share/php:/usr/share/pear /usr/share/fusionforge/plugins/scmhook/library/scmsvn/hooks/committracker/post.php /var/lib/gforge/scmrepos/svn/testmodule 26369

The program finds out changed files with 'svnlook changed -r 26369 /var/lib/gforge/scmrepos/svn/testmodule'

Result is:

A tags/TEST_2_3_4/Test/
D tags/TEST_2_3_4/Test/src/Kaputt.java
A tags/TEST_2_3_4/Test/src/Kaputt.java

The result is filtered to get the list of changed files only and looped through:

foreach ($changed as $onefile) {
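
The filtering step can be sketched as follows, in Python rather than the hook's PHP. The function name `parse_changed` is hypothetical and this is not the hook's actual code; it only assumes that each line of `svnlook changed` output is a status column followed by a path.

```python
def parse_changed(output):
    """Return the unique paths from `svnlook changed` output, in first-seen order.

    Hypothetical helper: each line is assumed to be '<status> <path>'.
    """
    paths = []
    for line in output.splitlines():
        parts = line.split(None, 1)  # split off the status column only
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        _status, path = parts
        if path not in paths:  # a path can be listed twice (D, then A)
            paths.append(path)
    return paths


# The change list quoted above for revision 26369.
sample = """A tags/TEST_2_3_4/Test/
D tags/TEST_2_3_4/Test/src/Kaputt.java
A tags/TEST_2_3_4/Test/src/Kaputt.java
"""
changed = parse_changed(sample)
```

Note that the directory entry `tags/TEST_2_3_4/Test/` stays in the list, which matches the behaviour described next.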

Starting with the given revision minus one, the change set of each revision is retrieved using 'svnlook changed'.
The result is searched for the current file. If the file is found, that revision is used as the last revision. Otherwise the revision is decremented by one and the search is retried.

In the above example, the first "file" is the project directory itself, which was created by the tag and therefore never appears in any earlier change set.

In the above example, the preceding revision reports:
fusionforge:~ # svnlook changed -r 26368 /var/lib/gforge/scmrepos/svn/testmodule
A tags/TEST_2_3_4/

The loop iterates over every revision, invoking svnlook more than 26000 times for each file.
On my setup, it takes about 30 minutes to perform the tag.
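
The cost of this backward scan can be illustrated with a small simulation. This is a sketch only: `find_previous_revision` and the stub `fake_changed` are hypothetical names standing in for the hook's loop and for a real `svnlook changed` call.

```python
def find_previous_revision(path, revision, changed_in):
    """Walk back from revision - 1 to 1, one lookup per revision.

    `changed_in(rev)` stands in for one `svnlook changed -r rev` call and
    returns the set of paths touched in that revision.
    Returns (previous revision or None, number of lookups performed).
    """
    calls = 0
    for rev in range(revision - 1, 0, -1):
        calls += 1
        if path in changed_in(rev):
            return rev, calls
    return None, calls


def fake_changed(rev):
    # Stub repository: only revision 26368 touches anything relevant.
    return {"tags/TEST_2_3_4/"} if rev == 26368 else set()


# A path that was first created by the tag is never found, so the scan
# visits every earlier revision before giving up.
prev, calls = find_previous_revision("tags/TEST_2_3_4/Test/", 26369, fake_changed)
```

With the stub above, the scan ends without a hit after 26368 lookups, which is the "more than 26000" invocations per file described here.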

As far as I can see, the same procedure is used with scmgit.

Messages
Date: 2015-05-05 14:32
Sender: Sylvain Beucler

Applied as 02950154516ac02f992e1dde7417f19f5522ee15
Thanks for the patch!

Date: 2015-04-30 07:48
Sender: Bernd Laengerich

The patch is based on the current revision of scmsvn/hooks/committracker/post.php. It can be applied with:
patch -i scmsvn_hooks_committracker_post_php.patch -o post.php.new post.php

Date: 2015-04-29 19:43
Sender: Bernd Laengerich

After discussion with Sylvain Beucler and some experiments, I fixed the performance issues. My solution consists of two steps:
1. Refactor the code to first check whether the commit log refers to any tasks or artifacts. If not, die. This prevents the code from diving into the svn history when no tracker artifact or task is referenced in the commit.
2. For each file, read the revision history into an array and search it for the current revision. Then take the next older revision from the array, if any. This removes the need to loop through all revisions.
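
The two steps above might look roughly like this. It is a sketch in Python, not the attached PHP patch: the function names are hypothetical, and the reference syntax (`[#123]` for tracker artifacts, `[T123]` for tasks) is an assumption about what the commit log contains.

```python
import re

# Assumed reference syntax: [#123] for artifacts, [T123] for tasks.
REFERENCE = re.compile(r"\[(#|T)(\d+)\]")


def find_references(log_message):
    """Step 1: list the (kind, id) references found in the commit log."""
    return [(kind, int(num)) for kind, num in REFERENCE.findall(log_message)]


def previous_revision(history, current):
    """Step 2: return the next older revision after `current`, if any.

    `history` is one file's revision list, newest first.
    """
    try:
        idx = history.index(current)
    except ValueError:
        return None  # the current revision is not in this file's history
    return history[idx + 1] if idx + 1 < len(history) else None


def should_process(log_message):
    # Exit early: without references there is nothing to record,
    # so the svn history never needs to be read.
    return bool(find_references(log_message))
```

For a tag commit whose log mentions no artifact or task, `should_process` returns False and no history lookup happens at all. Otherwise, one history query per file replaces the per-revision loop.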

As I was experimenting on 5.3rc4, I will attach a patch file tomorrow that will be based on the current revision of post.php.

I would like to thank Sylvain for his support and discussion.

Date: 2015-04-28 16:10
Sender: Sylvain Beucler

It looks like you only read the first line of my previous answer, but aside from that I agree.

Date: 2015-04-28 15:56
Sender: Bernd Laengerich

I would like to disagree, as in our business we use the before/after revision of each file in the tracker.
I made some more tests and found that the whole process is unnecessary when TAGGING: a tagged set of files never has a previous revision (the hook iterates over all svn revisions for all tagged files without a single hit), because a tag is just a snapshot without history. So the committracker should be skipped on tagging.
For a regular committed change set, there should be a better way to retrieve the history of each file than iterating over the svn revisions.
One possible solution may be to run
svn log -l 2 "file://" . $repository $onefile
and parse the output.
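
Parsing that output could be sketched as below, in Python rather than PHP. The helper name `revisions_from_log` is hypothetical, and the sample log is made up for illustration; it only relies on the standard `svn log` layout, where each entry starts with a line of the form `r<number> | author | date | ...`.

```python
import re

# Header line of each `svn log` entry: "r<number> | author | date | n lines".
REV_LINE = re.compile(r"^r(\d+) \|", re.MULTILINE)


def revisions_from_log(log_output):
    """Return the revision numbers found in `svn log` output, newest first."""
    return [int(rev) for rev in REV_LINE.findall(log_output)]


# Made-up sample in the usual `svn log -l 2` layout.
sample_log = """------------------------------------------------------------------------
r120 | alice | 2015-04-28 14:00:00 +0200 (Tue, 28 Apr 2015) | 1 line

Fix build
------------------------------------------------------------------------
r97 | bob | 2015-04-27 09:30:00 +0200 (Mon, 27 Apr 2015) | 1 line

Initial import
------------------------------------------------------------------------
"""
revisions = revisions_from_log(sample_log)
```

With a limit of 2, the first number is the current revision of the file and the second, if present, is the one before it.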

Date: 2015-04-28 15:13
Sender: Sylvain Beucler

One solution is to trim the amount of info we provide (the before/after version of *each* file is not particularly important IMHO).

If we want to keep all these pieces of info, we could use something like:
$ svnlook history /srv/svn/myrepo trunk/README -l1
-------- ----
9278 /trunk/README
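
Reading revisions out of that output could be sketched like this, in Python rather than PHP. `parse_history` is a hypothetical helper; it assumes each data line is a revision number followed by a path, and that any other line (such as the dashed separator) can be ignored.

```python
def parse_history(output):
    """Return (revision, path) pairs from `svnlook history` output.

    Hypothetical helper: lines that do not start with a number are skipped.
    """
    entries = []
    for line in output.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].isdigit():
            entries.append((int(parts[0]), parts[1]))
    return entries


# The output quoted above.
sample_history = """-------- ----
9278 /trunk/README
"""
history = parse_history(sample_history)
```

Asking for two entries instead of one would give the current revision and the one before it, if the file has any earlier history.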


Size    Name                                         Date              By
2 KiB   scmsvn_hooks_committracker_post_php.patch    2015-04-30 07:48  progbernie
Field           Old Value                                        Date              By
status_id       Open                                             2015-05-05 14:32  beuc-inria
close_date      None                                             2015-05-05 14:32  beuc-inria
Target Release  None                                             2015-05-05 14:32  beuc-inria
Resolution      None                                             2015-05-05 14:32  beuc-inria
File Added      501: scmsvn_hooks_committracker_post_php.patch   2015-04-30 07:48  progbernie