Distribution
=========================

## Project intentions

**Problem statement and requirements**

* What is the exact scope of the problem?


Design a professional-grade, extensible content distribution system that allows docker users to:

... by default enjoy:

	* an efficient, secure and reliable way to store, manage, package and exchange content

... optionally:

	* hack/roll their own on top of healthy open-source components

... with the liberty to:

	* implement their own home-made solution through good specs and a solid extension mechanism


* Who will the result be useful to?

	* users
	* ISVs (who distribute images or develop image distribution solutions)
	* docker

* What are the use cases (distinguish dev & ops population where applicable)?

	* Everyone (... uses docker push/pull).

* Why does it matter that we build this now?

	* Shortcomings of the existing codebase are the #1 pain point (by far) for users, partners and ISVs, hence the most urgent thing to address (?)
	* That situation is getting worse every day, and killer competitors are emerging or have already emerged.

* Who are the competitors?

	* existing artifact storage solutions (e.g. Artifactory).
	* emerging products that aim at handling pull/push in place of docker.
	* ISVs that are looking for alternatives to work around this situation

**Current state: what do we have today?**

Problems of the existing system:

1. not reliable
	* registry goes down whenever the hub goes down
	* failed pushes result in broken repositories
	* concurrent push is not handled
	* python boto and gevent have a terrible history
	* organically grown, under-designed features are in bad shape (search)
2. inconsistent
	* discrepancies between duplicated APIs (and the *duplicated APIs* themselves)
	* unused features
	* missing essential features (proper SSL support)
3. not reusable
	* tight entanglement with the hub component makes it very difficult to use outside of docker
	* proper access-control is almost impossible to do right
	* not easily extensible
4. not efficient
	* no parallel operations (by design)
	* sluggish client-side processing / bad pipeline design
	* poor reusability of content (random ids)
	* scalability issues (tags)
	* too many useless requests (protocol)
	* too much local space consumed (local garbage collection: broken + not efficient)
	* no squashing
5. not resilient to errors
	* no resume
	* error handling is obscure or nonexistent
6. security
	* content is not verified
	* current tarsum is broken 
	* random ids are a headache
7. confusing
	* registry vs. registry.hub?
	* layer vs. image?
8. broken features
	* mirroring is not done correctly (too complex, bug-laden, caching is hard)
9. poor integration with the rest of the project
	* technology discrepancy (Python vs. Go)
	* poor testability
	* poor separation (API in the engine is not defined enough)
10. missing features / prevents future
	* trust / image signing
	* naming / transport separation
	* discovery / layer federation
	* architecture + OS support (e.g. ARM/Windows)
	* quotas
	* alternative distribution methods (transport plugins)

**Future state: where do we want to get?**

* Deliverable
	* new JSON/HTTP protocol specification
	* new image format specification
	* (new image store in the engine)
	* new transport API between the engine and the distribution client code / new library
	* new registry in Go
	* new authentication service on top of the trust graph in Go

* What are the interactions with other components of the project?
	* critical interactions with docker push/pull mechanism
	* critical interactions with the way docker stores images locally

* In what way will the result be customizable?
	* transport plugins allowing for radically different transport methods (BitTorrent, direct S3 access, etc.)
	* extensibility design for the registry allowing for complex integrations with other systems
	* backend storage drivers API
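
To make the "backend storage drivers API" item more concrete, here is a minimal sketch of what such a contract could look like in Go. Nothing here is taken from an actual specification: the `StorageDriver` interface, its three methods and the `MemoryDriver` implementation are assumptions made purely for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrNotFound is returned when a key has no stored content.
var ErrNotFound = errors.New("key not found")

// StorageDriver is a hypothetical minimal contract a registry could
// require from any storage backend (local disk, S3, ...).
type StorageDriver interface {
	Put(key string, data []byte) error
	Get(key string) ([]byte, error)
	Delete(key string) error
}

// MemoryDriver is an illustrative in-memory backend, safe for concurrent use.
type MemoryDriver struct {
	mu    sync.RWMutex
	blobs map[string][]byte
}

func NewMemoryDriver() *MemoryDriver {
	return &MemoryDriver{blobs: make(map[string][]byte)}
}

func (m *MemoryDriver) Put(key string, data []byte) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	// Store a copy so later changes to the caller's slice do not leak in.
	m.blobs[key] = append([]byte(nil), data...)
	return nil
}

func (m *MemoryDriver) Get(key string) ([]byte, error) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	data, ok := m.blobs[key]
	if !ok {
		return nil, ErrNotFound
	}
	// Return a copy so callers cannot mutate stored content.
	return append([]byte(nil), data...), nil
}

func (m *MemoryDriver) Delete(key string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	if _, ok := m.blobs[key]; !ok {
		return ErrNotFound
	}
	delete(m.blobs, key)
	return nil
}

func main() {
	// The registry code would depend only on the interface.
	var driver StorageDriver = NewMemoryDriver()

	if err := driver.Put("layers/example", []byte("data")); err != nil {
		panic(err)
	}
	got, err := driver.Get("layers/example")
	fmt.Println(string(got), err)
}
```

Because callers only see the interface, swapping the in-memory backend for another one would not require changes in the code that uses it.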


## Kick-off output

**What is the expected output of the kick-off session?**

* draft specifications
* separate binary tool for demo purposes
* a mergeable PR that fixes 90% of the listed issues


* agree on a vision that allows solving all the issues that are deemed worthy
* propose a long-term battle plan with clear milestones that encompasses all of these
* define a first milestone that is compatible with the future and already delivers some of the solutions
* deliver the specifications for image manifest format and transport API
* deliver a working implementation that can be used as a drop-in replacement for the existing v1 with an equivalent feature-set

**How is the output going to be demoed?**

* docker pull
* docker push

**Once demoed, what will be the path to shipping?**

A minimal PR that includes the first subset of features to make docker work well with the new server-side components.

## Pressing matters

 * need a codename (ship, distribute)
 * new repository
 * new domains

 * architecture / OS
 * persistent ids
 * registry discovery
 * naming (quay.io/foo/bar)
 * mirroring



## Assorted issues

 * some devops want a docker engine that cannot do push/pull