Import Export Plugin
- The import/export plugin aims to ease the migration of project data between forges. The goal is to allow interoperability between projects hosted on different forges. Data in these forges is usually locked in: it is quite easy to start a new project, but much harder to move a mature project when its developers decide to migrate to a different forge system. This project will try to solve the following use cases:
- Export a project from FusionForge, along with all its information (issues, trackers, etc.).
- Import a project to FusionForge that has been exported from another forge.
- Ideally, define a common standard to represent the data in such a way that it is easily consumable by other services.
The project will make use of the OSLC format. More details can be found in the specification; also take a look at ForgePlucker (see below).
- Tracker item: #227
- Development branch: -
- Plugin Changelog
- Plugin Changelog for importexport
- in development
- Matrix by Fusionforge Version and by Linux Distribution
- 1 Building blocks
- 2 Web Interface Features
- 3 Links and resources
Building blocks
The import/export plugin will have to support the different formats exported by the different types of platforms.
Generic import tools: ForgePlucker
ForgePlucker works by scraping a forge's pages (generated HTML intended for human viewing) and attempting to convert them into a parsable format.
There is a Coclico fork, with 2 branches:
- master was not ready for real use, but was "the right thing" to do
- overplucker is less clean but works better
This project will be examined further as an alternative to a built-in PHP plugin. Note, though, that ForgePlucker may need to be run by the end user, because some data extraction requires their remote password. The main gain of such an external tool is that it allows for centralised import/export functionality without the undesired performance implications of PHP.
A drawback of this approach is that currently I [nioniosfr] am not that familiar with Python, which might lead to unexpected delays.
ForgePlucker is mainly targeted at project users, permissions and trackers; the latest changes target documents and file releases.
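To illustrate the scraping approach described above, here is a minimal sketch in Python using only the standard library. It is not ForgePlucker's code: the class name, the function name and the sample HTML are assumptions made for illustration, and a real forge page would need far more tolerant parsing.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside <tr>
        self._cell = None  # text fragments of the cell being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Turn an HTML table into a list of dicts keyed by the header row."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Hypothetical tracker listing, standing in for a real forge page.
SAMPLE = """
<table>
  <tr><th>ID</th><th>Summary</th><th>Status</th></tr>
  <tr><td>1</td><td>Crash on login</td><td>Open</td></tr>
  <tr><td>2</td><td>Typo in docs</td><td>Closed</td></tr>
</table>
"""

items = scrape_table(SAMPLE)
```

The sketch also shows why this approach is fragile: any change in the forge's HTML layout changes what the parser sees.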
Code / SCM
Normally the user can easily pull and push a complete repository and should need no assistance. Complete this section if you think otherwise.
SVN has a new svnrdump utility but it's hard to make it work for import:
- a pre-revprop-change hook must be installed
- repository must be completely empty, not even the standard trunk/branches/tags directories
So we need an interface in FusionForge to import SVN repositories, or create an empty repository ready for import as described above.
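The two requirements above could be scripted roughly as follows. This is only a sketch, assuming the svnadmin and svnrdump command-line tools are installed and that the forge has file-system access to the repository; the function names and the permissive hook body are illustrative, not part of FusionForge.

```python
import os
import subprocess

# A hook that accepts every revision property change. svnrdump load needs
# revision property changes to be allowed on the target repository.
HOOK_BODY = "#!/bin/sh\n# Allow all revision property changes.\nexit 0\n"


def install_revprop_hook(repo_path):
    """Write an executable pre-revprop-change hook into the repository."""
    hook = os.path.join(repo_path, "hooks", "pre-revprop-change")
    os.makedirs(os.path.dirname(hook), exist_ok=True)
    with open(hook, "w") as fh:
        fh.write(HOOK_BODY)
    os.chmod(hook, 0o755)
    return hook


def prepare_import_target(repo_path):
    """Create a completely empty repository (no trunk/branches/tags)."""
    subprocess.run(["svnadmin", "create", repo_path], check=True)
    return install_revprop_hook(repo_path)


def load_dump(dump_file, repo_url):
    """Feed a dump file to 'svnrdump load', which reads it from stdin."""
    with open(dump_file, "rb") as fh:
        subprocess.run(["svnrdump", "load", repo_url], stdin=fh, check=True)
```

Once the import has finished, the permissive hook should probably be removed or replaced with a stricter one.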
Exports in RDF / JSON-LD
The data export of a project can also be achieved with RDF or JSON-LD (which is now a standard). This approach avoids the drawback of having to convert the incoming data into meaningful variables, since the data is already described by the format itself. Its benefits are the performance gain explained above and a uniform implementation for both the import and export mechanisms. However, it was pointed out that "RDF is just a pivot format for inter-forge exchanges", so this needs to be revisited before proceeding with the implementation.
Following the same principles as RDF, the OSLC plugin could potentially offer an export mechanism for the trackers via its REST interface.
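As an illustration of what "described by the format itself" means, the sketch below serialises one tracker item as JSON-LD. The choice of vocabulary terms (Dublin Core and OSLC Change Management), the URL layout and the input field names are assumptions made for this example, not a decided mapping.

```python
import json

# Namespace prefixes used in the exported document.
CONTEXT = {
    "dcterms": "http://purl.org/dc/terms/",
    "oslc_cm": "http://open-services.net/ns/cm#",
}


def tracker_item_to_jsonld(item, base_url):
    """Map a tracker item (hypothetical dict layout) to a JSON-LD document."""
    return {
        "@context": CONTEXT,
        "@id": f"{base_url}/tracker/{item['id']}",  # assumed URL scheme
        "@type": "oslc_cm:ChangeRequest",
        "dcterms:identifier": str(item["id"]),
        "dcterms:title": item["summary"],
        "oslc_cm:status": item["status"],
    }


doc = tracker_item_to_jsonld(
    {"id": 7, "summary": "Crash on login", "status": "Open"},
    "https://forge.example.org/projects/demo",
)
serialised = json.dumps(doc, indent=2)
```

Because every key resolves to a vocabulary URI through the context, an importer does not need forge-specific knowledge to interpret the fields.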
GitHub provides an API for developers to access the public or private repositories of its users, as well as all their metadata.
The API is accessed via HTTP GET requests, and the user can authenticate via basic authentication (username, password) or OAuth2.
Responses can be returned in JSON, among other formats. More information is available in the documentation about media types.
For example, a cURL request can retrieve all the issues of the project redcarpet of user vmg.
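The same request can be sketched in Python with the standard library. The function names and the User-Agent value are made up for this example. Only the first page of results is fetched, and authentication is left out.

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://api.github.com"


def issues_url(owner, repo, state="all"):
    """Build the URL of the issue list for a repository."""
    path = f"/repos/{urllib.parse.quote(owner)}/{urllib.parse.quote(repo)}/issues"
    query = urllib.parse.urlencode({"state": state})
    return f"{API_ROOT}{path}?{query}"


def fetch_issues(owner, repo, state="all"):
    """Fetch the first page of issues as a list of dicts (needs network)."""
    request = urllib.request.Request(
        issues_url(owner, repo, state),
        headers={
            "Accept": "application/vnd.github+json",
            "User-Agent": "importexport-sketch",  # illustrative value
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


url = issues_url("vmg", "redcarpet")
```

Calling fetch_issues("vmg", "redcarpet") would perform the actual request. By default the API lists only open issues, which is why the sketch asks for state=all.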
Currently there are many different tools available for extracting project metadata from the Google Code website.
A few of these projects can be found with a simple search (GitHub project).
Google used to provide an API for the issue trackers but, as stated in the official documentation, it has been shut down.
- google code issues migrator: project that exports issues stored in Google Code and imports them into a GitHub user's repository.
- YouTrack Python library: YouTrack's Python import library for issues from different platforms.
We already have some import/export scripts for FusionForge:
Import/export for the Mediawiki_Plugin should be available to the end user, including for private projects.
Current status: daily XML export for public projects only, see MediaWiki plugin page.
- import/export for Mailman users and configuration
- mass import/export for Mailman archives (currently: only per-month archive mbox with e-mail obfuscation)
Web Interface Features
The user interface for the import/export plugin will focus on having a clear design. The main access point to the plugin will be a tab in the project's navigation bar, similar to the SCM tab for instance. All users will have access to the plugin and its functionality, since it is aimed at promoting interoperability.
Since each project can define private and public data, the project's user restrictions will also apply to the data exposed by the import/export plugin. Non-members of a project will still have access to its public data and will be able to export it (provided that public data is available).
Design and linking are necessary tasks for the future development of import/export. The user interface will comply with the rest of the FusionForge design principles.
Below are the tasks that are under implementation.
Select the source to import from
When using the import/export plugin, the user will be able to choose among different project platforms and select the one they want to import project data from (Google Code, SourceForge, etc.).
Select the objects within FusionForge to import to
At least during the first releases, the plugin will require the user to specify which objects/categories of FusionForge data are going to be imported. This will most likely change in the future, but at the time of writing it is considered a feature of the web interface.
Select a file to import from
Users will also be given the possibility to upload a file from which data can be imported into FusionForge.
Select the format to use for exportation
The export feature will support different formats. The user will have the option to select whether the export file will be, for example, a comma-separated values (CSV) file or a JSON file.
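A format switch of this kind could look like the following sketch. The function name and the record layout are hypothetical; the real plugin would be written against FusionForge's own data model.

```python
import csv
import io
import json


def export_records(records, fmt):
    """Serialise a list of flat dicts as JSON or CSV text."""
    if fmt == "json":
        return json.dumps(records, indent=2)
    if fmt == "csv":
        if not records:
            return ""
        buffer = io.StringIO()
        writer = csv.DictWriter(
            buffer,
            fieldnames=list(records[0].keys()),  # header taken from first record
            lineterminator="\n",
        )
        writer.writeheader()
        writer.writerows(records)
        return buffer.getvalue()
    raise ValueError(f"unsupported export format: {fmt}")


records = [{"id": 1, "summary": "Crash, on login"}]
as_csv = export_records(records, "csv")
as_json = export_records(records, "json")
```

Note that CSV only suits flat records, whereas JSON can also carry nested data such as the comments attached to an issue.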
Select the categories of project data to export
Since not all project data will be accessible by everyone, the user will be given the option to review which parts should be extracted. This could be presented either as a table with a checkbox next to each category, or as a multi-select element where the user can select multiple entries.
This approach aims to eliminate confusion about which data is publicly accessible (for non-members of the project), and gives an overview of which data the user is willing to export (issues, mailing lists, files, etc.).
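The selection logic behind such a form might be sketched as below. The Category class, the function names and the single is_member flag are simplifying assumptions; FusionForge's real permission model is finer-grained.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Category:
    name: str
    public: bool


def exportable_categories(categories, is_member):
    """Categories to offer in the form: public ones, plus all for members."""
    return [c for c in categories if c.public or is_member]


def select_for_export(categories, is_member, requested):
    """Validate the user's ticked categories against what they may export."""
    allowed = {c.name for c in exportable_categories(categories, is_member)}
    denied = sorted(set(requested) - allowed)
    if denied:
        raise PermissionError(f"not exportable for this user: {denied}")
    return list(requested)


# Hypothetical project setup.
CATEGORIES = [
    Category("issues", public=True),
    Category("files", public=True),
    Category("mailing lists", public=False),
]

visible_to_guest = [c.name for c in exportable_categories(CATEGORIES, False)]
```

Validating the submitted selection on the server side, rather than relying on what the form displayed, keeps private categories out of exports requested by non-members.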
Links and resources
- OSLC specifications
- Project_Import_Plugin Projectimport plugin
- Foaf_Profiles_Plugin FOAF plugin
- DOAP_RDF_plugin DOAP RDF plugin
- Articles by Olivier Berger:
- Jailbreaking the Forges: project export/import efforts, as part of project Coclico
- http://www-public.telecom-sudparis.eu/~berger_o/papier-oss2013/: article on Linked Data descriptions of Debian source packages
- Three Systemic Problems with Open-Source Hosting Sites: 2009 blog post from forgeplucker developer Eric S. Raymond describing the need for import/export