Prior to this commit, change tracking in sbt 1.3.0 was done via the
changed(Input|Output)Files tasks which were tasks returning
Option[ChangedFiles]. The ChangedFiles case class was defined in io as
case class ChangedFiles(created: Seq[Path], deleted: Seq[Path], updated: Seq[Path])
When no changes were found, or if there were no previous stamps, the
changed(Input|Output)Files tasks returned None. This made it impossible
to tell whether nothing had changed or if it was the first time.
Moreover, the api was awkward as it required pattern matching or folding
the result into a default value.
To address these limitations, I introduce the FileChanges class. It can
be generated regardless of whether or not previous file stamps were
available. The changes contains all of the created, deleted, modified
and unmodified files so that the user can directly call these methods
without having to pattern match.
It is tedious to write (foo / allInputFiles).value so I added simple
extension method macros that expand `foo.inputFiles` to
(foo / allInputFiles).value and `foo.outputFiles` to
`(foo / allOutputFiles).value`.
While writing documentation for the new file management/incremental
task evaluation features, I realized that incremental task evaluation
did not have the correct semantics. The problem was that calls to
`.previous` are not scoped within the current task. By this, I mean that
say there are tasks foo and bar and that the defintion of bar looks like
bar := {
val current = foo.value
foo.previous match {
case Some(v) if v == current => // value hasn't changed
case _ => process(current)
}
}
The problem is that foo.previous is stored in
effectively (foo / streams).value.cacheDirectory / "previous". This
means that it is completely decoupled from foo. Now, suppose that the
user runs something like:
> set foo := 1
> bar // processes the value 1
> set foo := 2
> foo
> bar // does not process the new value 2 because foo was called, which updates the previous value
This is not an unrealistic scenario and is, in fact, common if the
incremental task evaluation is changed across multiple processing steps.
For example, in the make-clone scripted test, the linkLib task processes
the outputs of the compileLib task. If compileLib is invoked separately
from linkLib, then when we next call linkLib, it might not do anything
even if there was recompilation of objects because the objects hadn't
changed since the last time we called compileLib.
To fix this, I generalizedthe previous cache so that it can be keyed on
two tasks, one is the task whose value is being stored (foo in the
example above) and the other is the task in which the stored task value
is retrieved (bar in the example above). When the two tasks are the
same, the behavior is the same as before.
Currently the previous value for foo might be stored somewhere like:
base_directory/target/streams/_global/_global/foo/previous
Now, if foo is stored with respect to bar, it might be stored in
base_directory/target/streams/_global/_global/bar/previous-dependencies/_global/_gloal/foo/previous
By storing the files this way, it is easy to remove all of the previous
values for the dependencies of a task.
In addition to changing how the files are stored on disk, we have to store
the references in memory differently. A given task can now have multiple
previous references (if, say, two tasks bar and baz both depend on the
previous value). When we complete the results, we first have to collect
all of the successful tasks. Then for each successful task, we find all
of its references. For each of the references, we only complete the
value if the scope in which the task value is used is successful.
In the actual implemenation in Previous.scala, there are a number places
where we have to cast to ScopedKey[Task[Any]]. This is due to
limitations of ScopedKey and Task being type invariant. These casts are
all safe because we never try to get the value of anything, we only use
the portion of the apis of these types that are independent of the value
type. Structural typing where ScopedKey[Task[_]] gets inferred to
ScopedKey[Task[x]] forSome x is a big part of why we have problems with
type inference.
Using List instead of vector makes the code a bit more readable. We
don't need indexed access into the data structure so its unlikely that
Vector was providing any performance benefit.
As part of a documentation push, I noticed that these were undocumented
and that there were some public apis in FileStamp that I intended to be
private[sbt].
I realized it was probably not ideal to have these implicit JsonFormats
defined directly in the FileStamp object because they might
inadvertently be brought into scope with a wildcard import.
I realized that it didn't make sense to specify these just for linux
even though they work fine with the version of gcc that runs on my mac.
I also ran scalafmt on this file.
I noticed that sometimes if I changed a build source and then ran reload
in the shell, I'd still see a warning about build sources having
changed. We can eliminate this behavior by resetting the
hasCheckedMetaBuild state attribute to false and skipping the
checkBuildSources step if the current command is 'reload'. We also now
skip checking the build source step if the command is exit or reboot.