autogits/doc/README.md


2024-08-18 19:18:21 +02:00
Introduction
============
The OBS (Open Build Service) was created at a time when the VCS field was still
evolving. One of the main problems left unsolved at the time was the handling of
large files. Large files are at the core of package sources -- think upstream
sources.

Today, this has changed. Git is the most popular and widely used VCS in history.
Entire businesses are built around providing services for Git. Git can also
deal with large files via the Git LFS subsystem.

Here, we'll detail how Git and Git LFS can be leveraged to provide a superior
contributor environment while increasing flexibility, transparency, and
traceability in OBS project management.

Overview of current project
---------------------------
OBS is used to build projects. It doesn't build packages or images; it *only*
builds projects. And while package management is exposed directly, project
management is hidden behind APIs, legacy workflows and a monolithic codebase.

The goal of this project is to move *project* management to Git and facilitate
*project* workflows via external, adaptable helpers. As a consequence, OBS will
be used to build any project, whether it lives in Git or in the legacy VCS
internal to OBS. But Git-based projects will no longer be constrained by
internal OBS machinery and can adopt any project-specific workflow in a modular
fashion, without the need to change OBS sources.

In particular, the goal is to move the current workflows for openSUSE:Factory,
as well as SLFO, out of OBS and into Git. OBS will still be used to build such
projects, but everything else, from approvals to maintainer definitions to
project configs, must be moved to Git. Doing so will not only simplify
workflows for package maintainers but will also bring transparency to project
history and secure our infrastructure with modern cryptography.

How Git works
-------------
Git contains only 4 basic object types:

* blob -- file data
* tree -- directory listing; contains other trees, blobs, or commits (aka,
  submodules)
* commit -- contains parent commit information and a tree object, forming an
  unchangeable, sealed history backed by a cryptographic hash function (kind of
  like a Bitcoin blockchain)
* tag -- an additional label associated with a commit

A good way of thinking about Git is not as a VCS, but as a multi-version file
system, where each revision is sealed by the revisions that follow it.

Each object is referenced from other objects by its hash. In repositories that
use the SHA-256 object format, the integrity of Git, along with the entire
evolution of the sources, is therefore backed by SHA-256 (Git's historical
default is SHA-1). In contrast, the integrity function used by OBS is MD5.

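
As a rough illustration of how this content addressing works, the sketch below
computes the ID Git assigns to a blob: the hash of a short header (`blob <size>`
followed by a NUL byte) plus the file data. The function name `git_blob_id` is
made up for this example. SHA-1 is Git's default object format; SHA-256 is the
newer alternative, chosen when a repository is created.

```python
import hashlib


def git_blob_id(data: bytes, algo: str = "sha256") -> str:
    """Return the object ID Git would assign to a blob holding `data`.

    `algo` is "sha1" (Git's default object format) or "sha256".
    """
    # Git hashes a small header together with the content, not the bare file.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.new(algo, header + data).hexdigest()


if __name__ == "__main__":
    # Should match: printf 'hello world\n' | git hash-object --stdin
    print(git_blob_id(b"hello world\n", "sha1"))
    # Changing a single byte yields an unrelated ID.
    print(git_blob_id(b"hello world\n"))
    print(git_blob_id(b"hello World\n"))
```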
Workflow of Git Projects
------------------------
OBS connects packages with projects. A commit to a package updates the projects
where an instance of the package resides.

Git has no notion of projects and packages. It simply manages source trees. The
work associated with managing a project and its packages is now done
externally.

The basic structure of a Git-managed project is shown below. The package
repository contains the package sources, while the project repository contains
all the information associated with the project, including pointers (aka, git
submodules) to all the package sources.

![ProjectGit submodule points to commit in PackageGit](project.svg "ProjectGit with corresponding PackageGit")
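
To make the "pointer" idea concrete: in a project repository, each package
appears as a tree entry of type `commit` with mode `160000` (a gitlink), which
records the exact package commit. The sketch below extracts those pins from the
text printed by `git ls-tree HEAD`. The helper name, the sample listing, the
file and package names, and the hashes are all invented for illustration.

```python
GITLINK_MODE = "160000"  # tree-entry mode Git uses for submodules


def submodule_pins(ls_tree_output: str) -> dict[str, str]:
    """Map submodule path -> pinned commit ID from `git ls-tree` output.

    Each line has the form "<mode> <type> <object>\t<path>".
    """
    pins = {}
    for line in ls_tree_output.splitlines():
        if not line.strip():
            continue
        meta, path = line.split("\t", 1)
        mode, obj_type, obj_id = meta.split()
        if mode == GITLINK_MODE and obj_type == "commit":
            pins[path] = obj_id
    return pins


# Invented listing of a ProjectGit tree (hashes shortened for readability).
SAMPLE = (
    "100644 blob 1111111\t.gitmodules\n"
    "100644 blob 2222222\t_config\n"
    "160000 commit aaaaaaa\tbash\n"
    "160000 commit bbbbbbb\tzlib\n"
)

if __name__ == "__main__":
    for path, commit in submodule_pins(SAMPLE).items():
        print(f"{path} -> {commit}")
```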
An update to a package must be represented as an update to the project. This
can happen in two ways: either directly, using `prjgit-updater`, which updates
the project git on every push,

![ProjectGit submodule is updated following push to PackageGit](project-update.svg "ProjectGit update following PackageGit push")

or indirectly, via the `pr-review` workflow, which updates the project git
through a PR,

![PackageGit submodule is updated via PR rerouted through ProjectGit](project-pr-update.svg "ProjectGit PR merges PackageGit PR")

In all cases, the project must be updated for the changes to be built. This is
akin to OBS today, except that in OBS the project is an internal state, mostly
hidden from inspection.

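
The two update paths can be contrasted with a toy in-memory model. This is only
a sketch: `ProjectGit`, `direct_update`, `PullRequest` and the approval rule are
hypothetical stand-ins and do not describe how `prjgit-updater` or `pr-review`
are actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class ProjectGit:
    """Toy project repository: package name -> pinned package commit."""
    pins: dict = field(default_factory=dict)
    history: list = field(default_factory=list)

    def set_pin(self, package: str, commit: str, reason: str) -> None:
        self.pins[package] = commit
        self.history.append(f"{package} -> {commit} ({reason})")


def direct_update(project: ProjectGit, package: str, pushed_commit: str) -> None:
    """Direct path: a push to the package repository moves the pin at once."""
    project.set_pin(package, pushed_commit, "push")


@dataclass
class PullRequest:
    """Indirect path: a proposed pin change that must be approved first."""
    package: str
    commit: str
    approved: bool = False

    def merge_into(self, project: ProjectGit) -> bool:
        if not self.approved:
            return False  # project unchanged, so nothing new gets built
        project.set_pin(self.package, self.commit, "PR merged")
        return True


if __name__ == "__main__":
    prj = ProjectGit()
    direct_update(prj, "zlib", "c0ffee1")

    pr = PullRequest("bash", "deadbee")
    print(pr.merge_into(prj))  # attempted before approval
    pr.approved = True
    print(pr.merge_into(prj))  # attempted after approval
    print(prj.history)
```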
Centralization of package management
------------------------------------
The proposal is to move all "official" package sources under a `/pool`
organization. Each "official" project would then have *one* branch assigned to
help with package updates.

The branches represent the current state of packages in a given project. Basic
package updates follow the `pr-review` workflow.