New maintainer for the statistics package

I will be happy to contribute to and maintain the statistics package; after all, this package has been very helpful to my work over the past years. I will certainly need some help, though, because I am totally inexperienced with the workflow involved in this task. Most of my coding is done locally for research purposes. Today was the first time I ever made a pull request, so you get the idea. Nevertheless, I am willing to put a few extra hours a week into maintaining and expanding the statistics package.

Thank you for your interest in this package. As nobody has spoken up against it so far, there should be no trouble with having you as the new statistics package maintainer.

The previous maintainer stepped down in 2018; see Octave Forge / statistics / Commit [4481d4] and the thread “Re: Octave 4.4.0 and statistical functions”.

The first step in maintaining a package on Octave Forge is to create a SourceForge account, because the package is hosted there using Mercurial (hg) rather than git: Octave Forge / statistics / [b7d349]. If you are not familiar with hg, we can switch the version control system to git; as the new maintainer, this is your decision :wink:

Thank you for the invitation. I will do my best to learn how to handle the maintenance requirements and to continue improving the statistics package. It would be helpful to stick with git, which I’ve used already, since I am totally unfamiliar with hg. Another question: since Octave Forge is in decline and new packages are welcome on GitHub, wouldn’t it be better to switch from SourceForge to GitHub as the main repository? We could sync the repo at SourceForge for reference purposes, or just publish every new release in Individual packages there so that Octave users can use the pkg -forge option.

You are right, Octave Forge is in decline. However, this process is still ongoing without a clear timeline (mostly due to my lack of time to finish the new pkg tool).

In the meantime, the Octave project made the decision to go with SourceForge for its current pkg tool, so we have to support this. As you might guess, this is the reason why many maintainers left the work-intensive Octave Forge ecosystem. But most of the work is currently left with me.

I like your idea to mainly stick with GitHub and only sync with SourceForge in the event of a release. Honestly, those SourceForge release tickets are mostly manually created paperwork. I neither need nor miss them, and recently I rely mostly on the GitHub Actions pipeline on Octave Packages for quality control. However, to sync with SourceForge you still need an account there, and you have to push your work there once a release is done. I’ll guide you when we get that far :wink:

I made a SourceForge account for syncing when the time comes. For the time being, I will clone the tip [b7d349] of the statistics repo at SourceForge, make a copy on GitHub, and start working on the bugs and patches you outlined in your previous post, other pending 41 patches.
Is there a proper way to transfer the complete commit history of the package from SourceForge to GitHub? Would this be something we want, or should we start clean from the current tip?

No need to start from scratch. If you give me a day or two, I will prepare a git version of the statistics package on SourceForge that you can clone to GitHub with ease.

ok, thanks. Let me know when it’s ready to make the transition.

@pr0m1th3as Thank you for your patience. I finally relocated the Mercurial repository to

and created a git clone here:

The guide I followed. Please verify that all commits are available. Now you can clone the SourceForge git repo to GitHub for development and, in the event of a release, push all your changes back to SourceForge. To do this, I need your SourceForge username to add you here: Members.

For any future hacking/patches, I assume it’s preferred to work with a git clone and patches rather than hg moving forward?

@siko1056 thanks for making the transition to git. I also had fast-export in mind, but I am glad you took care of it. I checked the commits between the old-hg and git versions. It appears that the only commits missing are those that add a release tag prior to making the release. I suppose this might be the reason why previous releases are not shown in the new git-based repo. All other commits appear intact, with different commit ids of course, as expected, e.g. [b7d349] default tip / => [840643] main /
My SourceForge username is pr0m1th3as (unsurprisingly :upside_down_face:).
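
The comparison between the two histories described above could be sketched roughly as follows. This is only an illustration in Python: the commit summaries and the tag name are made up, and `missing_summaries` and `is_tag_commit` are hypothetical helper names. It relies on Mercurial's default message for tagging changesets, which starts with "Added tag".

```python
from collections import Counter


def missing_summaries(hg_summaries, git_summaries):
    """Return hg commit summaries that have no counterpart in the git history."""
    remaining = Counter(git_summaries)
    missing = []
    for summary in hg_summaries:
        if remaining[summary] > 0:
            remaining[summary] -= 1  # matched one git commit with this summary
        else:
            missing.append(summary)
    return missing


def is_tag_commit(summary):
    """Mercurial's default message for a tagging changeset starts this way."""
    return summary.startswith("Added tag ")


# Made-up sample data, for illustration only.
hg_log = [
    "fix kmeans",
    "Added tag release-1.4.3 for changeset 0123abcd4567",
    "update NEWS",
]
git_log = ["fix kmeans", "update NEWS"]

missing = missing_summaries(hg_log, git_log)
print(missing)
print(all(is_tag_commit(s) for s in missing))
```

In practice the two lists could come from `hg log --template "{desc|firstline}\n"` and `git log --format=%s` run in the respective clones.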

@nrjank I will transfer the git version asap to a new repository on GitHub under GNU Octave · GitHub. My initial goal is to close all the existing open tickets on Savannah listed by @siko1056 (other pending 41 patches) before making a new release. Since my contributions will be regularly committed to the GitHub repository and pushed back to SourceForge only in the event of a release, I would suggest that new patches be applied to the GitHub repository. I will make an announcement once it’s ready. It would be ideal if new issues were opened directly at the GitHub repo as well, but I am not sure how this can be achieved, so I will be checking Savannah periodically. I assume the old-hg repo at SourceForge will be frozen in its current state.

When making such a conversion, the commit/changeset hashes change. We have comments on the bug tracker that reference a changeset hash, and even commit messages that point to another changeset by its hash. Is it not possible to at least add the original hashes to the git commit messages as part of the conversion to git? If the original hash is mentioned in the git commit message, we still lose fast reference to the right commit, but we could at least search the messages for references to it.

Even if they were added to the commit messages (I can’t say how), it would still take serious effort to track them down. What if we kept the Mercurial repo under its original name (instead of *-old-hg) in its current state, so that the links from Savannah (up to this point) resolve properly, and from this point forward we add the commits for bugs or patches to the new repo on GitHub?
In that case, we wouldn’t be able to push back new changes, but we could still upload the tarballs of new releases at Octave Forge - Browse /Octave Forge Packages/Individual Package Releases at SourceForge.net for the pkg -forge option to work properly. Or at least I think so; someone more experienced with the pkg command can say whether this is a viable option.

A Mercurial patch is in essence diff output, which can be applied via patch. As many patches have been on the Savannah tracker for a long time, I generally doubt the usefulness of the hg IDs they contain. We still have the old hg repo available as a reference if necessary: just add “-old-hg” to the few referenced URLs and you are done. Those patches have to be refactored anyway, and it will pay off in the long run to have a proper maintainer for the package again :slightly_smiling_face:
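
As a rough illustration of that URL adjustment, here is a minimal Python sketch. The URL shape (`sourceforge.net/p/octave/statistics/ci/<hash>/`) is an assumption about how such commit links look and is not taken from this thread; `to_old_hg_url` is a hypothetical helper name.

```python
import re

# Assumed shape of a SourceForge commit link; adjust if the real links differ.
_OLD_REPO = re.compile(r"(sourceforge\.net/p/octave/statistics)(/ci/)")


def to_old_hg_url(url):
    """Point a commit link at the renamed '-old-hg' repository."""
    # Links that already contain 'statistics-old-hg' do not match and are
    # returned unchanged, so the function can safely be applied twice.
    return _OLD_REPO.sub(r"\1-old-hg\2", url)


print(to_old_hg_url("https://sourceforge.net/p/octave/statistics/ci/b7d349/"))
```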

@pr0m1th3as I think it is best to move forward with the development and not try to please old patterns, which have burned out a developer in the past. Make it your repo, and decide what is best for your workflow and satisfaction. The community and those eager to see their old patches applied now have an amazing new chance to be heard and will find ways to work with you :wink: The repo has been mostly unmaintained for years. I am happy that you are starting to take care of it, and if real, not hypothetical, problems show up, I am happy to help. Thank you!

I will be working on the GitHub repo for my convenience, so if you want to revert the name of the Mercurial repository to its original so that the links work properly, I am quite fine with that. Honestly, I think it might even be preferable to do so (revert the *-old-hg name) and make it read-only, so that all new patches are pushed only to GitHub. I’ve already created a new repo at GitHub - gnu-octave/statistics: The Statistics package for GNU Octave, and for the time being I am working locally until we decide how to handle this transition. Keeping a duplicate git repo at SourceForge just adds to the complexity of updating it. One idea is that I could make the 1.4.4 release on GitHub at its current stage (identical to Mercurial) before pushing any new changes from the bugs and patches still open on Savannah. In that case, we would just upload the tarball of the new release to SourceForge and that’s it: there would be no changes or new releases from the Mercurial statistics repo, and its git version could simply be deleted, since there is the one at GitHub.

Another thing I would like help with is how to close the tickets on Savannah. I’ve made a request for inclusion in the GNU Octave project, but I am not sure whether that is all it takes. Apologies for the long replies and numerous questions; this is quite new to me, so some help is necessary. It also feels better to make decisions about the statistics package collectively, since I’ve just started and I wouldn’t like my decisions to appear invasive to other contributors and previous maintainers.

I too made a conversion of the statistics package that:

  1. includes the original hg tags (which github marks as releases)
  2. appends the original hg hash to the end of commit message
  3. also appends the original svn revision number to the end of the commit message (this information was in the mercurial changeset extra data)
  4. replaces the bug and patch numbers with a note on where they come from (some are Savannah numbers but others are SourceForge numbers; this also prevents GitHub from linking issue numbers incorrectly)

It’s available at GitHub - carandraug/octave-statistics

The only thing I didn’t do was clean up the authors’ names. If the above is of any interest, @siko1056, and you share the authors file that you used for your conversion, I can make a git repo that also cleans that up.

This is great, @carandraug. Can you share your steps so that we can reproduce this with other repos too?

I did not do that either. Looking at GitHub, the default seems to preserve the names well enough.

@pr0m1th3as if you have not started too much already, you can decide to pick up @carandraug’s work now, and we force-push his repo to GitHub and SourceForge. It is your decision.

This is unfortunately working against the rules of OctaveForge, but you do not have to bother about them at all, I keep OctaveForge consistent, until the transition to Octave Packages is done. Then I will push your GitHub repo to OctaveForge once you release.

Just for clarification: in Octave Forge, each package can only have exactly one repository, with exactly the same name as the package itself (the URL on the Octave Forge page is auto-generated, with no flexibility). This unique repo must point to the actual source code of the distributed Octave package (GPLv3, some package components are generated by scripts, etc.), and I am not willing to make a transition between git and hg with each release.

Added you to the SourceForge group as Developer. But as I said, inform me about a release on GitHub and I will do the necessary steps on Octave Forge. Do not bother about this for now.

To get the tags and releases on github, I only did git push --tags. I think you probably already had them but just didn’t push them. For the other things I hacked the fast-export plugin issue_prefix like so:

diff --git a/plugins/issue_prefix/__init__.py b/plugins/issue_prefix/__init__.py
index 5dd30b5..56b245d 100644
--- a/plugins/issue_prefix/__init__.py
+++ b/plugins/issue_prefix/__init__.py
@@ -12,6 +12,46 @@ class Filter:
         self.prefix = args
 
     def commit_message_filter(self, commit_data):
-        for match in re.findall(b'#[1-9][0-9]+', commit_data['desc']):
-            commit_data['desc'] = commit_data['desc'].replace(
-                match, b'#%s%s' % (self.prefix, match[1:]))
+        commit_data['desc'] = re.sub(
+            b'bugs #45671, #45670',
+            b'#savannah-bug:45671, #savannah-bug:45670',
+            commit_data['desc'],
+        )
+        commit_data['desc'] = re.sub(
+            b'Closes #145',
+            b'Closes #SF-bug:145',
+            commit_data['desc'],
+        )
+        commit_data['desc'] = re.sub(
+            b'Closes bug #147',
+            b'Closes #SF-bug:147',
+            commit_data['desc'],
+        )
+        commit_data['desc'] = re.sub(
+            b'bug#34765 in savannah',
+            b'#savannah-bug:34765',
+            commit_data['desc'],
+        )
+        commit_data['desc'] = re.sub(
+            b'patch #([1-9][0-9]+)',
+            lambda x: b'#' + self.prefix + b'-patch:' + x.group(1),
+            commit_data['desc'],
+        )
+        commit_data['desc'] = re.sub(
+            b'(b|B)ug ?#([1-9][0-9]+)',
+            lambda x: b'#' + self.prefix + b'-bug:' + x.group(2),
+            commit_data['desc'],
+        )
+
+        if not commit_data['desc'].endswith(b'\n'):
+            commit_data['desc'] += b'\n'
+
+        commit_data['desc'] += b'\n'
+
+        if ('extra' in commit_data
+            and 'convert_revision' in commit_data['extra']
+            and commit_data['extra']['convert_revision'].startswith("svn:")):
+
+            commit_data['desc'] += commit_data['extra']['convert_revision'].encode() + b'\n'
+
+        commit_data['desc'] += b'hg:' + commit_data['hg_hash'].encode()

There were a couple of special cases which I found by scanning the commit messages for # and then wrote the corresponding rules above. This could be adapted by other projects in the future.
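
One way to double-check that no special case was missed is to scan the converted messages for `#` references that still lack a source prefix. A minimal Python sketch, assuming the commit messages are available as bytes; `unconverted_refs` is a hypothetical helper name and the sample messages are made up:

```python
import re

# '#' directly followed by digits: a reference that has not been rewritten
# into the '#<source>-bug:NNN' or '#<source>-patch:NNN' form.
BARE_REF = re.compile(rb"#[0-9]+")


def unconverted_refs(messages):
    """Yield (message, refs) for messages that still contain bare '#NNN'."""
    for msg in messages:
        refs = BARE_REF.findall(msg)
        if refs:
            yield msg, refs


# Made-up messages, for illustration only.
sample = [
    b"Closes #145",
    b"kmeans: see #savannah-bug:45671",
    b"update NEWS",
]
for msg, refs in unconverted_refs(sample):
    print(msg, refs)
```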

Finally I called hg-fast-export like so:

path-to-fast-export/hg-fast-export.sh -r path-to-hg-repo/ --plugin=issue_prefix=savannah

An alternative to adding the svn rev and hg hash to the git commit message is to add them as git notes. This is what hg-fast-export does with the --hg-hash option. However, I think that those can easily be left behind (they need to be pushed separately) and can be modified afterwards.
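
With the hashes stored in the message itself, a lookup from the original hg hash to the new git commit can be rebuilt from the log alone. A minimal Python sketch, assuming the message layout produced by the filter above (a blank line, an optional `svn:` line, then an `hg:` line); `parse_trailers` and `build_index` are hypothetical helper names and the sample commit is made up:

```python
def parse_trailers(message):
    """Return (hg_hash, svn_revision) read from the last lines of a message."""
    hg_hash = None
    svn_rev = None
    # Walk backwards from the end and stop at the first line that is
    # neither an 'hg:' nor an 'svn:' trailer.
    for line in reversed(message.rstrip("\n").splitlines()):
        if line.startswith("hg:") and hg_hash is None:
            hg_hash = line[len("hg:"):]
        elif line.startswith("svn:") and svn_rev is None:
            svn_rev = line
        else:
            break
    return hg_hash, svn_rev


def build_index(commits):
    """Map original hg hash -> git commit id for (git_id, message) pairs."""
    index = {}
    for git_id, message in commits:
        hg_hash, _svn = parse_trailers(message)
        if hg_hash is not None:
            index[hg_hash] = git_id
    return index


# Made-up commit, for illustration only.
example = (
    "kmeans: fix empty cluster handling\n"
    "\n"
    "svn:example-uuid/trunk@1234\n"
    "hg:0123456789abcdef"
)
print(parse_trailers(example))
print(build_index([("git-id-1", example)]))
```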

OK. In that case, I generated an authors file now. What do you think of it?

"aadler"                                               = "Andy Adler <aadler@users.sf.net>"
"adb014"                                               = "David Bateman <adb014@users.sf.net>"
"Alastair Harrison <aharrison24@gmail.com>"            = "Alastair Harrison <aharrison24@gmail.com>"
"Andreas Bertsatos <abertsatos@biol.uoa.gr>"           = "Andreas Bertsatos <abertsatos@biol.uoa.gr>"
"Anthony Morast <anthony.a.morast@gmail.com>"          = "Anthony Morast <anthony.a.morast@gmail.com>"
"A.R. Burgers <arburgers@gmail.com>"                   = "A.R. Burgers <arburgers@gmail.com>"
"Arno Onken <asnelt@asnelt.org>"                       = "Arno Onken <asnelt@asnelt.org>"
"Arno Onken <asnelt@users.sourceforge.net>"            = "Arno Onken <asnelt@asnelt.org>"
"asnelt"                                               = "Arno Onken <asnelt@asnelt.org>"
"asnelt@users.sourceforge.net"                         = "Arno Onken <asnelt@asnelt.org>"
"whyly"                                                = "Arno Onken <asnelt@asnelt.org>"
"axkma"                                                = "axkma <axkma@users.sf.net>"
"carandraug"                                           = "David Miguel Susano Pinto <carandraug@octave.org>"
"Carnë Draug <carandraug@octave.org>"                  = "David Miguel Susano Pinto <carandraug@octave.org>"
"Colin Macdonald <cbm@m.fsf.org>"                      = "Colin Macdonald <cbm@m.fsf.org>"
"Dag Lyberg <daglyberg80@gmail.com>"                   = "Dag Lyberg <daglyberg80@gmail.com>"
"etienne"                                              = "Etienne Grossmann <etienne@users.sf.net>"
"fpoto"                                                = "Francesco Potortì <fpoto@users.sf.net>"
"hauberg"                                              = "Søren Hauberg <hauberg@users.sf.net>"
"John D"                                               = "John Donoghue <john.donoghue@ieee.org>"
"John Donoghue"                                        = "John Donoghue <john.donoghue@ieee.org>"
"John Donoghue <john.donoghue@ieee.org>"               = "John Donoghue <john.donoghue@ieee.org>"
"jpicarbajal"                                          = "Juan Pablo Carbajal <ajuanpi+dev@gmail.com>"
"Juan Pablo Carbajal <ajuanpi+dev@gmail.com>"          = "Juan Pablo Carbajal <ajuanpi+dev@gmail.com>"
"JuanPi Carbajal <ajuanpi+dev@gmail.com>"              = "Juan Pablo Carbajal <ajuanpi+dev@gmail.com>"
"Lachlan Andrew <lachlanbis@gmail.com>"                = "Lachlan Andrew <lachlanbis@gmail.com>"
"Michael Leitner <michael.leitner@frm2.tum.de>"        = "Michael Leitner <michael.leitner@frm2.tum.de>"
"Nicholas R. Jankowski <jankowskin@asme.org>"          = "Nicholas R. Jankowski <jankowski.nicholas@gmail.com>"
"Nicholas R. Jankowski <jankowski.nicholas@gmail.com>" = "Nicholas R. Jankowski <jankowski.nicholas@gmail.com>"
"nir-krakauer"                                         = "Nir Krakauer <mail@nirkrakauer.net>"
"Nir Krakauer <mail@nirkrakauer.net>"                  = "Nir Krakauer <mail@nirkrakauer.net>"
"Nir Krakauer <nirkrakauer@gmail.com>"                 = "Nir Krakauer <mail@nirkrakauer.net>"
"Nir Krakauer <nkrakauer@ccny.cuny.edu>"               = "Nir Krakauer <mail@nirkrakauer.net>"
"Nir-Krakauer <nkrakauer@ccny.cuny.edu>"               = "Nir Krakauer <mail@nirkrakauer.net>"
"Olaf Till <i7tiol@t-online.de>"                       = "Olaf Till <i7tiol@t-online.de>"
"Oliver Heimlich <oheim@posteo.de>"                    = "Oliver Heimlich <oheim@posteo.de>"
"Pascal Dupuis <cdemills@gmail.com>"                   = "Pascal Dupuis <cdemills@gmail.com>"
"Philip Nienhuis <prnienhuis@users.sf.net>"            = "Philip Nienhuis <prnienhuis@users.sf.net>"
"Piotr Dollar <pdollar@gmail.com>"                     = "Piotr Dollar <pdollar@gmail.com>"
"pkienzle"                                             = "Paul Kienzle <pkienzle@users.sf.net>"
"Rafael Laboissiere <rafael@laboissiere.net>"          = "Rafael Laboissiere <rafael@laboissiere.net>"
"schloegl"                                             = "Alois Schloegl <alois.schloegl@gmail.com>"
"sis-sou"                                              = "sis-sou <sis-sou@users.sf.net>"
"Stefano Guidoni <ilguido@users.sf.net>"               = "Stefano Guidoni <ilguido@users.sf.net>"
"Steven Waldrip"                                       = "Steven Waldrip"
"tealev"                                               = "Torok Levente <tealev@users.sf.net>"
"thomas-weber"                                         = "Thomas Weber <thomas-weber@users.sf.net>"
"wsloand"                                              = "Bill Denney <wsloand@users.sf.net>"
"wwwandy"                                              = "Andreas Weber <wwwandy@users.sf.net>"
"XT Zhou <terrencestark@hotmail.com>"                  = "XT Zhou <terrencestark@hotmail.com>"

I was unsure what to do about many of those. In most cases I knew who the person was but was not sure which email to use. For people I have contacted many times, I picked an email address I knew of. For others, even though I sometimes had an email address, I just used their real name and the SourceForge email address: even though no one uses the SF email, it at least shows the SourceForge username, which can be used as a starting point to track down a person.

There were two cases where I have no idea:

  • sourceforge username sis-sou which made some early svn commits. Absolutely no clue about this.

  • “Steven Waldrip” which I’m not sure is the real name. His commits point to savannah’s bug #59924, patch #10016, and patch #10019 but come from anonymous comments. @PhilipN is the one who pushed the commits and may know better.
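
To spot entries like these mechanically, the authors file can be parsed and checked for targets that lack an email part. A minimal Python sketch, assuming every non-empty line has the `"old" = "new"` shape shown above; `parse_authors` and `entries_without_email` are hypothetical helper names:

```python
import re

# One mapping per line: "old identity" = "new identity"
_LINE = re.compile(r'^\s*"(?P<old>.+?)"\s*=\s*"(?P<new>.+)"\s*$')


def parse_authors(text):
    """Parse an authors map into a dict of old identity -> new identity."""
    mapping = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        match = _LINE.match(line)
        if match is None:
            raise ValueError(f"line {lineno}: cannot parse {line!r}")
        mapping[match.group("old")] = match.group("new")
    return mapping


def entries_without_email(mapping):
    """Return the old identities whose target has no '<email>' part."""
    return sorted(old for old, new in mapping.items() if "<" not in new)


# Three lines copied from the list above.
SAMPLE = '''
"asnelt"                                               = "Arno Onken <asnelt@asnelt.org>"
"whyly"                                                = "Arno Onken <asnelt@asnelt.org>"
"Steven Waldrip"                                       = "Steven Waldrip"
'''

authors = parse_authors(SAMPLE)
print(entries_without_email(authors))
```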

I cloned @carandraug’s repo into a new statistics repo at GitHub so that we start with the same history and hashes as the original hg repo at SourceForge. Unfortunately, I couldn’t do that (in a clean manner at least) without deleting the previous statistics repo, but since I don’t have owner rights, I just renamed the initial statistics repository to statistics1, which can now be deleted.

I leave SourceForge to you. What would be helpful is a way to close the statistics-related bugs and patches on Savannah, as I am already working on the open issues there.