This page is proposed for deletion. See our wiki cleanup plan for more information. |
Changeset
(for a short intro of the basic concepts of Mercurial, see UnderstandingMercurial)
A changeset (sometimes abbreviated "cset") is an atomic collection of changes to files in a repository. It contains all recorded local modifications that lead to a new revision of the repository.
A changeset is identified uniquely by a changeset ID. In a single repository, you can identify it using a revision number.
The act of creating a changeset is called a commit or checkin. A changeset includes the actual changes to the files and some meta information. The meta information in a changeset includes:
- the nodeid of its manifest
- the list of changed files
- information about who made the change (the "committer"), why ("comments") and when (date/time, timezone)
- the name of the branch ("default", if omitted or not set)
Each changeset has zero, one, or two parent changesets. It has two parents if the commit was a merge, and no parent if it is a root in the repository. A repository may have multiple roots (normally there is only one), each representing the start of a branch.
If a changeset is not the head of a branch, it has one or more child changesets (it is then the parent of its child changesets).
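The parent and child relationships described above can be sketched as a small graph. This is only an illustration: the `Changeset` class, the helper functions, and the single-letter names are hypothetical and are not part of Mercurial's API.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Changeset:
    """Hypothetical stand-in for a changeset; not Mercurial's own class."""
    name: str
    parents: Tuple[str, ...] = ()  # zero, one, or two parent names


def children(graph: Dict[str, Changeset], name: str) -> List[str]:
    """Changesets that list `name` as one of their parents."""
    return sorted(c.name for c in graph.values() if name in c.parents)


def roots(graph: Dict[str, Changeset]) -> List[str]:
    """Changesets without any parent."""
    return sorted(c.name for c in graph.values() if not c.parents)


def heads(graph: Dict[str, Changeset]) -> List[str]:
    """Changesets without any child."""
    return sorted(n for n in graph if not children(graph, n))


def is_merge(cs: Changeset) -> bool:
    return len(cs.parents) == 2


graph = {cs.name: cs for cs in [
    Changeset("a"),               # root: no parent
    Changeset("b", ("a",)),
    Changeset("c", ("b",)),
    Changeset("d", ("b",)),       # second child of "b"
    Changeset("e", ("c", "d")),   # merge: two parents
]}

print(roots(graph))           # changesets with no parent
print(children(graph, "b"))   # "b" is the parent of both of these
print(heads(graph))           # changesets with no children
```

The sketch treats a head simply as a changeset without children, which matches the description above but ignores named branches.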
The working directory can be updated to any committed changeset of the repository, which then becomes the parent of the working directory.
"Updating" back to a changeset which already has a child, changing files and then committing creates a new child changeset, thus starting a new branch. Branches can be named.
All changesets of a repository are stored in the changelog.
Here's what the internal representation of a changeset looks like:
    $ hg debugdata .hg/00changelog.d 1208
    1102691ceab8c8f278edecd80f2e3916090082dd <- the corresponding manifest nodeid
    mpm@selenic.com <- the committer
    1126146623 25200 <- the date, in seconds since the epoch, and seconds offset from UTC
    mercurial/commands.py <- the list of changed files, followed by the commit message

    Clean up local clone file list

    We now use an explicit list of files to copy during clone so that we don't copy anything we shouldn't.
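As a rough illustration of that layout, the following sketch splits such an entry into its fields. The `ChangelogEntry` type and `parse_changelog_entry` function are hypothetical names, and the parser assumes exactly the layout shown above (manifest nodeid, committer, date line, one file per line, a blank line, then the message); it is not Mercurial's own parser.

```python
from typing import List, NamedTuple


class ChangelogEntry(NamedTuple):
    """Hypothetical container for the fields of one changelog entry."""
    manifest_nodeid: str
    committer: str
    timestamp: int    # seconds since the epoch
    utc_offset: int   # seconds offset from UTC
    files: List[str]
    message: str


def parse_changelog_entry(text: str) -> ChangelogEntry:
    # The first blank line separates the header fields from the message.
    header, _, message = text.partition("\n\n")
    lines = header.split("\n")
    manifest_nodeid, committer, date_line = lines[0], lines[1], lines[2]
    timestamp, utc_offset = date_line.split(" ")[:2]
    return ChangelogEntry(
        manifest_nodeid=manifest_nodeid,
        committer=committer,
        timestamp=int(timestamp),
        utc_offset=int(utc_offset),
        files=lines[3:],   # every remaining header line is a changed file
        message=message,
    )


RAW = (
    "1102691ceab8c8f278edecd80f2e3916090082dd\n"
    "mpm@selenic.com\n"
    "1126146623 25200\n"
    "mercurial/commands.py\n"
    "\n"
    "Clean up local clone file list\n"
    "\n"
    "We now use an explicit list of files to copy during clone so that we "
    "don't copy anything we shouldn't.\n"
)

entry = parse_changelog_entry(RAW)
print(entry.manifest_nodeid, entry.files)
```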
Committing a new changeset
Committing a changeset to the repository involves updating the Revlogs for all modified files, the Manifest, and the Changelog. The commit is a two-stage process. The first stage walks from top to bottom: from the changelog, to the manifest, to the files. The second stage goes back up: from the files, to the manifest, to the changelog.
First stage (top to bottom)
The first step is to get the Changelog of the parent revision. The changelog is a virtual file, in that it doesn't necessarily exist directly as a file in the repository. Instead, it is versioned in a Revlog, just like all of your tracked files. From the revlog, any version of the Changelog can be constructed on the fly, as needed.
The changelog has one version (one entry in its revlog) for every revision of the repository. Each version of the changelog stores meta information about the revision, including a timestamp for the commit, the username that made the commit, and the commit log. The most important thing it stores is a Nodeid which indicates a specific version of the Manifest.
Like the changelog, the manifest is a versioned virtual file. It has its own revlog, and the nodeid specified in the changelog uniquely identifies one of the entries in the manifest's revlog (i.e., a specific version of the manifest). The second step is therefore to take the nodeid indicated in the changelog and fetch that version of the manifest. Remember, this is the version of the manifest used by the parent revision.
Each version of the manifest is like a snapshot of the files in the repository at a given moment (i.e., in a particular revision of the repository). The manifest doesn't store the contents of the files directly; instead, it stores a Nodeid for each tracked file. Just like the manifest nodeid stored in the changelog, each nodeid in the manifest indicates a particular version of that file (i.e., a particular entry in the file's revlog).
The third and final step of the first stage is to get the revision specified in the manifest for each file that has been modified in this changeset. These are the parent versions of the files.
Notice that this first stage is basically just updating the repository to the parent revision, except that nothing is actually changed on disk. The update is created virtually and kept in memory.
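The three steps of the first stage can be sketched with plain dictionaries standing in for the revlogs. Everything here is hypothetical: the dictionary layout, the short string ids such as "cs1" and "m1", and the `load_parent_versions` function exist only for illustration and do not reflect Mercurial's storage format or API.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory stand-ins for the three kinds of revlog,
# each keyed by a (made-up) nodeid.
changelog: Dict[str, Dict[str, str]] = {
    "cs1": {"manifest": "m1", "user": "alice", "desc": "initial commit"},
}
manifest_log: Dict[str, Dict[str, str]] = {
    "m1": {"README": "f_readme_1", "main.py": "f_main_1"},
}
filelogs: Dict[str, Dict[str, str]] = {
    "README": {"f_readme_1": "hello\n"},
    "main.py": {"f_main_1": "print('hi')\n"},
}


def load_parent_versions(
    parent_cs: str, modified: List[str]
) -> Tuple[Dict[str, str], Dict[str, str], Dict[str, str]]:
    """Walk top to bottom: changelog entry, then manifest, then files."""
    # Step 1: the changelog entry of the parent revision.
    entry = changelog[parent_cs]
    # Step 2: the manifest version named by that changelog entry.
    manifest = manifest_log[entry["manifest"]]
    # Step 3: the parent version of each modified file. A newly added
    # file has no entry in the parent manifest and is skipped.
    parent_files = {
        path: filelogs[path][manifest[path]]
        for path in modified
        if path in manifest
    }
    return entry, manifest, parent_files


entry, manifest, parent_files = load_parent_versions("cs1", ["main.py", "new.txt"])
print(parent_files)
```

Nothing is written anywhere in this sketch; like the first stage itself, it only reads the parent versions into memory.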
Second stage (bottom to top)
With all of the parent versions identified and reconstructed for the changelog, the manifest, and all the modified files, the second stage can begin to construct new entries for each of the affected revlogs. The first step in creating a revlog entry is to determine the new nodeid, which will uniquely identify that entry in the revlog. Nodeids are constructed by hashing the nodeids of the two parent versions together with the complete contents of the new version of the file. Remember that the parent nodeids are not the same as the parent ChangeSetId. The parent nodeids are the identifiers of other entries in the same revlog. The ChangeSetId is only a parent id for the changelog, not for the manifest or the files. For the manifest, the parent nodeid is the one that was specified in the parent version of the changelog, and for the files, the parent nodeid is the one specified in the parent version of the manifest.
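A minimal sketch of such a hash is given below. The text above only says that the two parent nodeids and the full contents are hashed; the use of SHA-1, the all-zero "null" id for a missing parent, and the sorting of the two parents before hashing are assumptions about how Mercurial does this, and the function name `nodeid` is chosen for illustration.

```python
import hashlib

# Assumption: a missing parent is represented by 20 zero bytes.
NULL_ID = b"\0" * 20


def nodeid(text: bytes, p1: bytes = NULL_ID, p2: bytes = NULL_ID) -> bytes:
    """Hash the two parent nodeids and the complete new contents."""
    # Assumption: the parents are sorted, so their order does not
    # affect the result.
    first, second = sorted((p1, p2))
    digest = hashlib.sha1(first)
    digest.update(second)
    digest.update(text)
    return digest.digest()


root = nodeid(b"hello\n")                  # no parents
child = nodeid(b"hello world\n", p1=root)  # one parent in the same revlog
print(root.hex())
print(child.hex())
```

Because the parent nodeids are part of the input, two entries with identical contents but different histories get different nodeids.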
The fact that the nodeid requires the complete contents of the new version of the file is the reason that the second stage needs to go from bottom to top. Nodeids for the new versions of the tracked files are computed first, then the manifest is updated with these new nodeids to create the new version of the manifest.
With the new version of the manifest prepared, a new manifest nodeid can be computed. This at last allows us to generate the new changelog, and then the new nodeid for the changelog, which will be the ChangeSetId for the new revision of the repository.
The final step is to actually update the revlogs for the changelog, the manifest, and all the modified files. This comes last because each revlog entry includes the ChangeSetId for the repository revision it corresponds to, and we didn't have this until the very end.
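The bottom-to-top order of the second stage can be sketched as follows. This is a simplified model, not Mercurial's implementation: the `commit` function, its parameters, and the text layouts used for the manifest and the changelog entry are hypothetical, and the hash (SHA-1 over the sorted parent nodeids plus the contents) is an assumption.

```python
import hashlib
from typing import Dict, Tuple

NULL_ID = b"\0" * 20  # assumed id for a missing parent


def nodeid(text: bytes, p1: bytes = NULL_ID, p2: bytes = NULL_ID) -> bytes:
    first, second = sorted((p1, p2))
    return hashlib.sha1(first + second + text).digest()


def commit(
    new_contents: Dict[str, bytes],
    parent_manifest: Dict[str, bytes],
    parent_manifest_id: bytes,
    parent_cs_id: bytes,
    user: str,
    date: str,
    message: str,
) -> Tuple[Dict[str, bytes], bytes, bytes]:
    # 1. Files first: each new file nodeid depends on the file's parent
    #    nodeid (taken from the parent manifest) and its new contents.
    manifest = dict(parent_manifest)
    for path, data in new_contents.items():
        manifest[path] = nodeid(data, parent_manifest.get(path, NULL_ID))

    # 2. Then the manifest: its text lists every file nodeid, so it can
    #    only be hashed once step 1 is done. (Simplified text layout.)
    manifest_text = b"".join(
        path.encode() + b"\0" + node.hex().encode() + b"\n"
        for path, node in sorted(manifest.items())
    )
    manifest_id = nodeid(manifest_text, parent_manifest_id)

    # 3. Finally the changelog entry, which names the manifest nodeid.
    #    Its own nodeid is the ChangeSetId of the new revision.
    changelog_text = "\n".join(
        [manifest_id.hex(), user, date, *sorted(new_contents), "", message]
    ).encode()
    changeset_id = nodeid(changelog_text, parent_cs_id)
    return manifest, manifest_id, changeset_id


m1, m1_id, cs1 = commit(
    {"README": b"hello\n", "main.py": b"print('hi')\n"},
    {}, NULL_ID, NULL_ID, "alice", "0 0", "initial commit",
)
m2, m2_id, cs2 = commit(
    {"main.py": b"print('hello')\n"},
    m1, m1_id, cs1, "alice", "60 0", "change greeting",
)
print(cs1.hex())
print(cs2.hex())
```

In the second commit only `main.py` gets a new nodeid; `README` keeps the nodeid recorded in the parent manifest, while the manifest and the changelog each gain a new entry.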
See also: ChangeSetComments, Design