Closing the ManagedClassLoader generated by test can cause nonlocal
effects because the jdk shares some JarFile resources across multiple
URLClassLoaders. As a result, if one classloader is loading a resource
while another classloader sharing the same JarFile is closed, the
resource loading may fail (see https://github.com/sbt/sbt/issues/5262). This can
be fixed by moving the scalatest framework jar (and its dependencies)
into an additional classloader layer that sits between the scala library
loader and the rest of the test dependencies.
In addition to adding the new layer, I reworked the
ReverseLookupClassLoader to use its dependent classloader to find
resources that may be below it in the classloading hierarchy rather than
constructing an entirely new classloader for resources.
After this change, I was able to run test in the repro project
(https://github.com/rjmac/sbt-5262) 1000 times with no failures. Note that
the repro is sensitive to the jdk used. I could not reproduce with jdk11
but I could typically induce a failure within 20 or so runs with jdk8.
I benchmarked this change with
https://github.com/eatkins/scala-build-watch-performance and performance
was roughly the same as 1.3.4 with turbo mode and about 200-250ms faster
in non-turbo mode (which can be explained by the time to load the
scalatest classes).
This makes it possible to do mkIvyConfiguration.value.withXXX(...) for
all the methods in InlineIvyConfiguration. (I need this to remove
inter-project resolvers when fetching dotty from sbt-dotty to avoid
accidentally fetching a local project in the build of dotty itself).
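As a minimal sketch of what this enables (the task key and the resolver predicate are illustrative; withResolvers is one of the generated withXXX methods):
```
import sbt.librarymanagement.ivy.InlineIvyConfiguration

val prunedIvyConfiguration = taskKey[InlineIvyConfiguration]("ivy configuration without inter-project resolvers")
prunedIvyConfiguration := {
  val conf = mkIvyConfiguration.value
  // drop the resolver that would resolve artifacts from local subprojects
  conf.withResolvers(conf.resolvers.filterNot(_.name == "inter-project"))
}
```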
The previous implementation of ZombieClassLoader was not thread safe.
This caused problems because it is possible for the ManagedClassLoader
in test to leak into the coursier thread pool if the test uses
bouncycastle apis. Unfortunately, these apis seem, in some cases, to
assign static variables using the thread context classloader. Because the
bouncycastle apis are implemented by the jdk, they are found in the
system classloader and thus the static references leak out of the test
context.
I had a local repro of https://github.com/sbt/sbt/issues/5249 that is
fixed by this change.
I found it hard to reason about where certain local variables, like
currentRef, were coming from. I also changed 'x' to 'extracted' in a few
places for clarity.
Rather than putting the background job temporary files in whatever
java.io.tmpdir points to, this commit moves the files into a
subdirectory of target in the project root directory.
To make the directory configurable via settings, I had to move the
declaration of the bgJobService setting later in the project
initialization process. I don't think this should matter because
background jobs shouldn't be created until after the project has loaded
all of its settings.
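A minimal sketch of that configuration (assuming the controlling key is named bgJobServiceDirectory; treat the name as illustrative):
```
// build.sbt — move background job temporary files under a custom directory
bgJobServiceDirectory := target.value / "custom-bg-jobs"
```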
When a user calls sbt exit and there is an active background job, sbt
may not exit cleanly. This was primarily because the
background job service shutdown method depended on the
StandardMain.executionContext, which was closed before the background job
service was shut down. This was fixed by reordering the resource
closing in StandardMain.runManaged.
When running a main method, if the user inputs ctrl+c, the `run` task
exits, but the main method is not interrupted, so it continues running
even after sbt has returned to the shell. If the main method is a
webserver, this prevents run from ever starting again on a fixed port.
To fix this, we can modify the waitForTry method to stop the job if an
exception is thrown (ctrl+c leads to an interrupted exception being
thrown by waitFor).
I rework the BackgroundJobService so that the default implementation of
waitForTry is now usable and no longer needs to be overridden. The side
effect of this change is that waitFor may now throw an exception. Within
sbt, waitFor was only used in one place and I reworked it to use
waitForTry instead. This could theoretically break a downstream user who
relied on waitFor not throwing an exception but I suspect that there
aren't many users of this api, if any at all.
The StateTransform class introduced in
9cdeb7120e did not cleanly integrate with
logic for transforming the state using the `transformState` task
attribute. For one thing, the state transform was only applied if the
root node returned StateTransform but no transformation would occur if a
task had a dependency that returned StateTransform. Moreover, the
transformation would override all other transformations that may have
occurred during task evaluation.
This commit updates StateTransform to act very similarly to the
transformState attribute. Instead of wrapping a `State` instance, it now
wraps a transformation function from `State => State`. This function
can be applied on top of, or prior to, the other transformations via the
`transformState` attribute.
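A minimal sketch of a task using the new api (the attribute key and task are hypothetical):
```
val sessionMarked = AttributeKey[Boolean]("sessionMarked")
val markSession = taskKey[StateTransform]("marks the session state after evaluation")
markSession := {
  // wraps a State => State function instead of a pre-built State
  StateTransform(s => s.put(sessionMarked, true))
}
```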
For binary compatibility with 1.3.0, I had to add the stateProxy
function as a constructor parameter in order to implement the `state`
method. The proxy function will generally throw an exception unless the
legacy api is used. To avoid that, I private[sbt]'d the legacy api so
that binary compatibility is preserved but any builds targeting > 1.4.x
will be required to use the new api.
Unfortunately I couldn't private[sbt] StateTransform.apply(state: State)
because mima interpreted it as a method type change because I added
StateTransform.apply(transform: State => State). This may be a mima bug
given that StateTransform.apply(state: State) would be jvm public even
when private[sbt], but I figured it was quite unlikely that any users
were using this method anyway since it was incorrectly implemented in
1.3.0 to return `state` instead of `new StateTransform(state)`.
The linting can take a while for large projects because `Def.compiled`
scales in the number of settings. Even for small projects (e.g. scripted
tests), it takes about 50 ms on my computer. This doesn't change the
current behavior because the default value is true.
This PR includes the values of the `description` and `homepage`
settings into the `ivy.xml` files generated by the `makeIvyXml`
task. It restores the behaviour of sbt 1.2.8, including when `useCoursier`
is set to `false`.
Two things are changed in this PR:
* `IvyXml.content` now adds the `homepage` attribute to the
`description` element if `project.info.homePage` is not empty.
* `CoursierInputsTasks.coursierProject0` now fills the previous
empty `CProject.info` field with the description and homepage.
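For reference, these are the two settings whose values now flow into the generated `ivy.xml` (values illustrative):
```
description := "A sample project description."
homepage := Some(url("https://example.com/hello"))
```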
Closes: #5234
Ref #4211. Fixes #4395. Fixes #4600.
This is a reimplementation of `--addPluginSbtFile`. #4211 implemented the command to load extra `*.sbt` files as part of the global plugin subproject. That had the unwanted side effect of not working when the `.sbt/1.0/plugins` directory does not exist. This changes the strategy to load the `*.sbt` files as part of the meta build.
```
$ sbt -Dsbt.global.base=/tmp/hello/global --addPluginSbtFile=/tmp/plugins/plugin.sbt
[info] Loading settings for project hello-build from plugin.sbt ...
[info] Loading project definition from /private/tmp/hello/project
sbt:hello> plugins
In file:/private/tmp/hello/
sbt.plugins.IvyPlugin: enabled in root
sbt.plugins.JvmPlugin: enabled in root
sbt.plugins.CorePlugin: enabled in root
sbt.ScriptedPlugin
sbt.plugins.SbtPlugin
sbt.plugins.SemanticdbPlugin: enabled in root
sbt.plugins.JUnitXmlReportPlugin: enabled in root
sbt.plugins.Giter8TemplatePlugin: enabled in root
sbtvimquit.VimquitPlugin: enabled in root
```
The current injection of the new nio keys will overwrite any definitions
of those keys in a build source. This is undesirable. The fix is to
create a mapping of scoped keys to settings and, for each injected setting
key, if there is a previous definition, put that definition after the
injected definition so that it can override it.
It is still possible for progress threads to leak so shut them down if
there are no active tasks. The report0 method will start up a new thread
if a task is added.
In some circumstances, sbt would generate a number of task progress
threads that could run concurrently. The issue was that the TaskProgress
could be shared by multiple EvaluateTaskConfigs if a dynamic task was
used. This was problematic because when a dynamic task completed, it
might call afterAllCompleted which would stop the progress thread. There
also was a race condition because multiple threads calling initial could
theoretically have created a new progress thread which would cause a
resource leak.
To fix this, we modify the shared task progress so that the `stop()`
method is a no-op. This should prevent dynamic tasks from stopping the
progress thread. We also defer the creation of the task thread until
there is at least one active task. This prevents a thread from being
created in the shell.
The motivation for this change was that I found that sometimes there was
a leaked progress thread that would make the shell not really work for
me because the progress thread would overwrite the shell prompt. This
change fixes that behavior and I was able to validate with jstack that
there was consistently either one or zero task progress threads at a
time (zero in the shell, one when tasks were actually running).
When running sbt -Dtask.timings=true, the task timings get printed to
the console which can overwrite the shell prompt. When we use a logger,
the timing lines are correctly separated from the prompt lines.
The completions were generating page numbers that didn't make sense when
there were a small number of scripted tests. For example, if only two
tests were defined, it would generate *1of3, *2of3 and *3of3 completions
even though there weren't even three tests.
The way clean was implemented, it was running `clean`, `ivyModule` and
`streams` concurrently. This was problematic because clean could blow
away files needed by `ivyModule` and `streams`. To fix this, move the
cleanCachedResolutionCache into a separate task and run that before the
normal clean.
Should fix https://github.com/sbt/sbt/issues/5067.
Fixes https://github.com/sbt/sbt/issues/3183
This implements an input task lintBuild that checks for unused settings/tasks.
Because most settings are intermediate inputs to other settings/tasks, they are included in the linting by default. The notable exceptions are settings used exclusively by a command. To opt out, you can either append the key to `Global / excludeLintKeys` or set its rank to invisible.
On the other hand, many tasks are leaves (invoked directly by the user), so task keys are excluded from linting by default. However, there are notable tasks that trip up users, so they are opted in using `Global / includeLintKeys`.
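For example, opting a setting out of (or a task into) the lint could look like this (myCommandOnlySetting and myFragileTask are placeholders for keys in your build):
```
Global / excludeLintKeys += myCommandOnlySetting
Global / includeLintKeys += myFragileTask
```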
I noticed that when entering the console, I'd often be left with a
supershell line at the bottom of the screen that would eventually get
interlaced with my console commands. This can be eliminated by clearing
the supershell progress before evaluating the task if it is one of the
skip tasks.
Fixes https://github.com/sbt/sbt/issues/1673
There's been report of intermittent "Could not create directory" error related to "classes.bak." retronym identified that all configurations are using the same directory, and that might be the cause of race condition.
This addresses the issue by assigning a unique directory for each configuration.
These classloaders, which are created if sbt is launched with a legacy
launcher (or one that doesn't follow the current classloading hierarchy
convention), were implemented in scala, but that meant that they were
not parallel capable. I fix that by moving the implementations to java.
I also move the static method that creates a MetaBuildLoader into the
java class.
A number of users were reporting issues with deadlocking when using
1.3.2: https://github.com/sbt/sbt/issues/5116. This seems to be because
most of the sbt created classloaders were not actually parallel capable.
In order for a classloader to be registered as parallel capable, ALL
of the parent classes in the class hierarchy except for Object must be
registered as parallel capable:
https://docs.oracle.com/javase/8/docs/api/java/lang/ClassLoader.html#registerAsParallelCapable--.
If a classloader is not registered as parallel capable, then a global
lock will be used internally for classloading and this can lead to deadlock.
It is impossible to register a scala 2 classloader as parallel capable
so I ported all of the classloaders to java.
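A minimal sketch of the limitation (MyLoader is illustrative): the registration must run before the ClassLoader superclass constructor chooses between a global lock and per-class locks, which requires a static initializer that Scala 2 cannot express:
```
import java.net.{ URL, URLClassLoader }

class MyLoader(urls: Array[URL], parent: ClassLoader)
    extends URLClassLoader(urls, parent) {
  // Too late: the ClassLoader superclass constructor has already checked
  // the registration and picked the lock by the time this body runs. In
  // java this call can live in a static initializer, hence the port.
  ClassLoader.registerAsParallelCapable()
}
```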
This commit updates the java-serialization scripted test. Prior to the
port, the new version of the test would more or less always deadlock.
After this change, I haven't been able to reproduce a deadlock.
This had no significant performance impact when I reran
https://github.com/eatkins/scala-build-watch-performance.
Fixes #1458
Running sbt from `/` results in sbt getting stuck trying to load the directories recursively, eventually erroring with a java.lang.OutOfMemoryError (after freezing for a long time) even on an Alpine container.
To prevent this, a check is added to see whether the absolute path is `/` or not.
```
/ $ sbt -Dsbt.version=1.4.0-SNAPSHOT
[error] java.lang.IllegalStateException: cannot run sbt from root directory without -Dsbt.rootdir=true; see sbt/sbt#1458
[error] Use 'last' for the full log.
```
The message:
```
Project loading failed: (r)etry, (q)uit, (l)ast, or (i)gnore?
```
is not explicit about retry being the option used when pressing return.
I don't think that dummy tasks really make sense for task progress
because they are evaluated outside of the normal task evaluation. This
came up because I was seeing streams-manager in supershell which didn't
seem useful.
I noticed some flickering in super shell progress lines and realized
that it was because there were multiple progress threads running
concurrently. This is problematic because each thread has a completely
different state so if each thread has an active task, the display will
flicker between the two tasks. I think this is caused primarily by
dynamic tasks. At least the example where I was seeing it was caused by
a dynamic task.
If a project had a meta-meta build (project/project), the build sources
in the project directory were ignored. This was because the projectGlobs
method did not correctly handle recursion. It inadvertently
discarded the accumulator globs and only returned the most recently
generated globs. This commit fixes that and adds a regression test to
the nio/reload scripted test.
I was looking into sbt start up time and in profiling was able to
identify a number of classloading bottlenecks. To speed up
initialization, we can preload those classes in the background. I saw
average speedups of roughly 0.75 seconds after this change. Also, the `time`
command would consistently report cpu usage very close to 400%, and I
have 4 cores on my laptop. With 1.3.0 it would be more like 350%.
There have been a number of complaints about the new classloader closing
behavior. It is too aggressive about closing classloaders after test and
run. This commit softens the behavior by allowing a classloader to be
resurrected after close by creating a new zombie classloader that has
the same urls as the original classloader. After this commit, we always
close the classloaders when we are done, but they can still leak
file descriptors if a zombie is created.
To configure the behavior, I add the allowZombieClassLoaders key. If it
is false (which is the default), we will warn but still allow them. If it
is true, then we silence the warning. In a later version of sbt, we can
change the semantics to be strict.
I verified after this change that I could add a shutdown hook in `run`
and it would be evaluated so long as I set `bgCopyClasspath := false`.
Otherwise the needed jars were deleted before the hooks could run.
Bonus: delete unused ResourceLoaderImpl class
Fixes #5070
This adds a new setting called `includePluginResolvers` (default `false`).
When set to `true`, the project will include resolvers from the metabuild.
This allows the build user to declare a resolver in one place (`project/plugins.sbt`) that gets applied to both the metabuild and all the subprojects. The scenario comes up when someone distributes software via their own repository. Ref #4103
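A minimal sketch (repository URL illustrative; scoping shown at ThisBuild):
```
// project/plugins.sbt — declare the resolver once for the metabuild
resolvers += "my-company-repo" at "https://repo.example.com/maven"

// build.sbt — let the subprojects pick up the metabuild resolvers
ThisBuild / includePluginResolvers := true
```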
We want to recursively monitor the project meta build, but we also want
to avoid listing directories that don't exist. To compromise, I rework
the buildSourceFileInputs to add the nested project directories if they
exist. Because the fileInputs are a setting, this means that adding a
new project directory and *.sbt or *.scala will not immediately trigger
a rebuild, but in most common cases, it will. I added a scripted test
for this.
In sbt 1.3.0, we only monitor build sources in the root project
directory and the root project meta build directory. This commit adds
these inputs for each project.
Fixes https://github.com/sbt/sbt/issues/5061.
This was an oversight that caused consoleQuick to not work with
supershell. We should probably try to figure out a way to allow custom
tasks to exclude themselves from super shell reporting.
In https://github.com/sbt/sbt/issues/5075 we realized that sbt 1.3.0
introduces a regression where it closes the classloader used to invoke
the main method for in-process run before all of its non-daemon threads
have terminated. To fix this and still close the classloader, I add a
method, runInBackgroundWithLoader that provides the background job
services with an optional classloader that it can close after the job
completes.
This cleanly merges and works with 1.3.x as well.
Fixes https://github.com/sbt/sbt/issues/5047
When setting swoval.tmpdir via globalBase, globalBase is now set to an absolute path.
`com.swoval.runtime.NativeLoader.loadPackaged` uses `java.lang.System.load`,
which requires an absolute path, so we should set `swoval.tmpdir` to an absolute path.
During refactoring, these warnings got out of date. I also added
scaladoc to the watchTriggeredMessage key.
Ref: https://github.com/sbt/sbt/issues/5051.
To avoid reliance on jvm global variables, we need to share the super
shell state with each of the console appenders that write to the console
out. We only set the progress state for the console appenders for the
screen. This prevents messages that are below the global logging level
from modifying the progress state without preventing them from being
written to other appenders.
The ability to set the ProgressState for each of the console appenders
is added in a companion util PR.
I verified that the test output of io/test was correctly written to the
streams after this change (there were no progress lines in the output).
Disabling supershell when color mode is disabled is a sensible default
(especially for piped output). However, I think it should still be
possible to use supershell in no color mode.
This requires a util change that also enables supershell in no color
mode.
It is redundant and slow to restamp all of the dependency classpath
files when they have likely already been stamped by a subproject.
For the classfiles of subprojects, we fill the managedFileStampCache
with the values returned by the zinc compile analysis product stamps,
so they are probably already in the managed cache and should be up to
date as long as zinc is working correctly.
I noticed that various outputFileStamps tasks were showing up in the
task timing report when I ran Test / definedTests in the main sbt project.
That task became about 400ms faster after this change.
I noticed that the reports generated when using sbt.task.timings=true
made very little sense. They were displaying timings for tests that
couldn't possibly have been run. I tracked this down to the TaskTimings
being stored in the progressReport setting, which meant they were reused
across multiple task runs. After this change, the reports made a lot
more sense.
The tab completions for scripted have long been broken. They display a
number of non-sensical pages like '*0of9' or '*1of0'. Some of the
multiparser changes seem to have caused these invalid completions.
In 5eab9df0df, I updated the
outputFileStamps task to compute all of the stamps for a directory
recursively if an output file is a directory. Prior to that, it had only
computed the stamp for the directory itself. This caused a significant
performance regression in creating the test classloader because it was
computing the last modified time for all of the classfiles in the classpath.
The test for 5000 source files in
https://github.com/eatkins/scala-build-watch-performance was running roughly
400ms slower due to this regression.
Ref https://github.com/sbt/sbt/issues/4905
This is a companion PR to https://github.com/sbt/librarymanagement/pull/318.
This will print the following warnings:
```
sbt:hello> compile
[warn] insecure HTTP request is deprecated 'Artifact(jsoup, jar, jar, None, Vector(), Some(http://jsoup.org/packages/jsoup-1.9.1.jar), Map(), None, false)'; switch to HTTPS or opt-in using from(url(...), allowInsecureProtocol = true) on ModuleID or .withAllowInsecureProtocol(true) on Artifact
[warn] insecure HTTP request is deprecated 'http://repo.typesafe.com/typesafe/releases/'; switch to HTTPS or opt-in as ("Typesafe Releases" at "http://repo.typesafe.com/typesafe/releases/").withAllowInsecureProtocol(true)
[warn] insecure HTTP request is deprecated 'http://repo.typesafe.com/typesafe/releases/'; switch to HTTPS or opt-in as ("Typesafe Releases" at "http://repo.typesafe.com/typesafe/releases/").withAllowInsecureProtocol(true)
[warn] insecure HTTP request is deprecated 'http://repo.typesafe.com/typesafe/releases/'; switch to HTTPS or opt-in as ("Typesafe Releases" at "http://repo.typesafe.com/typesafe/releases/").withAllowInsecureProtocol(true)
[warn] insecure HTTP request is deprecated 'Patterns(ivyPatterns=Vector(), artifactPatterns=Vector(http://repo.typesafe.com/typesafe/releases/[organisation]/[module](_[scalaVersion])(_[sbtVersion])/[revision]/[artifact]-[revision](-[classifier]).[ext]), isMavenCompatible=true, descriptorOptional=false, skipConsistencyCheck=false)'; switch to HTTPS or opt-in as Resolver.url("Typesafe Ivy Releases", url(...)).withAllowInsecureProtocol(true)
[warn] insecure HTTP request is deprecated 'Patterns(ivyPatterns=Vector(), artifactPatterns=Vector(http://repo.typesafe.com/typesafe/releases/[organisation]/[module](_[scalaVersion])(_[sbtVersion])/[revision]/[artifact]-[revision](-[classifier]).[ext]), isMavenCompatible=true, descriptorOptional=false, skipConsistencyCheck=false)'; switch to HTTPS or opt-in as Resolver.url("Typesafe Ivy Releases", url(...)).withAllowInsecureProtocol(true)
[warn] insecure HTTP request is deprecated 'Patterns(ivyPatterns=Vector(), artifactPatterns=Vector(http://repo.typesafe.com/typesafe/releases/[organisation]/[module](_[scalaVersion])(_[sbtVersion])/[revision]/[artifact]-[revision](-[classifier]).[ext]), isMavenCompatible=true, descriptorOptional=false, skipConsistencyCheck=false)'; switch to HTTPS or opt-in as Resolver.url("Typesafe Ivy Releases", url(...)).withAllowInsecureProtocol(true)
```
I inadvertently changed the semantics of clean so that cleanFiles would
only delete the file if it was a regular file. In older versions of sbt,
if a file in cleanFiles was a directory, it would be recursively
deleted.
During refactoring of Continuous, I inadvertently changed the semantics
of `~` so that all multi commands were run regardless of whether or not
an earlier command had failed. I fixed the issue and added a regression
test.
It was reported in https://github.com/sbt/sbt/issues/4973 that the
scalaVersion setting was not being correctly set in a script running
with ScriptMain using 1.3.0-RC4. Using git bisect, I found that the
issue was introduced in
73cfd7c8bd.
That commit manipulates the classloaders passed in by the launcher, but
only for the xMain entry point. I found that the script ran correctly if
I updated the classloader for ScriptedMain as well.
After these changes, the example script in #4973 correctly prints 2.13.0
for the scala version with a locally published sbt.
Bonus: rename the xMainImpl object to xMain. It was private[sbt] anyway.
https://github.com/sbt/sbt/issues/4986 reported that +compile would
always recompile everything in the project even when the sources hadn't
changed. This was because the dependency classpath was changing between
calls to compile, which caused the external hooks cache introduced in
32a6d0d5d7 to invalidate the scala
library. To fix this, I cache the file stamps on a per scala version
basis. I added a scripted test that checks that there is no
recompilation in two consecutive calls to `+compile` in a multi scala
version build. It failed prior to these changes.
I looked for serial bottlenecks in sbt project loading and discovered
that KeyIndex.aggregate was relatively easily parallelizable. Before
this time, it took about 1 second to run KeyIndex.aggregate in the akka
project on my computer. After this change, it took 250ms. Given that I
have 4 logical cores, the speedup is roughly linear.
It turns out that injecting the keys necessary for incremental tasks
causes a significant startup penalty for many larger projects. For
example, akka starts up about 3 seconds faster if we do not inject these
settings for the tasks returning `File` or `Seq[File]`. Given that all
of these apis use java.nio.file anyway, it makes sense to not backport
them to older tasks.
The clean task got a lot slower in 1.3.0
(https://github.com/sbt/sbt/issues/4972). The reason for this was that
sbt 1.3.0 generated many custom clean tasks for any tasks that returned
`File` or `Seq[File]`. Each of these tasks was tagged with
Tags.Clean which meant that only one of them could run at a time. As a
result, it took a long time to evaluate all of the custom tasks, even if
they were no-ops. In the akka project, a no-op clean was taking 35
seconds which is simply unacceptable. After this change, a no-op clean
takes less than a second in akka (a full clean only takes about 6
seconds after running test:compile).
To fix this, I stopped aggregating the clean task across configs and
projects. Because I removed the aggregation, I needed to manually
implement clean in the `Compile` and `Test` configurations to make
`Compile / clean` and `Test / clean` work correctly.
Fixes #4964
Together with https://github.com/sbt/util/pull/211, this brings back stack trace suppression for custom tasks by default.
Debug level logs are available in `last`, and this prints a message informing the user of the fact. BLUE on a dark background is difficult to read, so I am changing the color highlight to MAGENTA.
After adding the automatic lookup to external hooks for missing binary
jars, the scripted test dependency-management/invalidate-internal
started failing. This was because the previous analysis contained a jar
dependency that still existed on disk but was no longer a part of the
dependency classpath. Fundamentally the problem is that the zinc
compile analysis is not tightly coupled with the sbt build state.
To fix this, we can cache the dependency classpath file stamps in the
same way that we cache the input file stamps in external hooks and
manually diff them at the sbt level. We then force updates regardless of
the difference between the zinc state and the sbt state.
It was reported that in community builds, sometimes there was
spurious over-compilation due to invalidation of the scala library jar
(https://github.com/sbt/sbt/issues/4948). The reason for this was that
the external hooks prefill the managed cache with all of the time
stamps for the project dependencies but were not looking up any jars that
weren't in the cache. I suspect I did this because I didn't realize that
zinc also includes its own classpath in the binaries which is not
a part of the dependencyClasspath. The fix is to just add the jar to the
cache if it doesn't already exist by switching to getOrElseUpdate from
get.
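A minimal sketch of the difference, with a plain mutable map standing in for the managed stamp cache (the real cache and stamp types differ):
```
import java.nio.file.{ Files, Path }
import scala.collection.mutable

val cache = mutable.Map.empty[Path, Long]

def stampOf(jar: Path): Long =
  // before: cache.get(jar) returned None for jars zinc added on its own,
  // which made them look perpetually invalidated
  cache.getOrElseUpdate(jar, Files.getLastModifiedTime(jar).toMillis)
```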
I followed the steps in #4948 and published a version of sbt locally
with this change and the spurious re-builds stopped.
In the code formatting use case, the formatting task may modify the
source files in place. If the formatting task uses the nio
inputFileStamps, then it would fill the in-memory cache of source paths
to file stamps. This would cause compile to see the pre-formatted
stamps. To fix this, we can invalidate the file cache entries for the
outputs of a task. This will cause the side-effect of some extra io
because the hashes may be computed three times: once for the format
inputs, once for the format outputs and once for the compile inputs. I
think most users would understand that adding auto-formatting would
potentially slow down compilation.
To really prove this out, I implemented a poor man's scalafmt plugin in
a scripted test. It is fully incremental. Even in the case when some
files cannot be formatted it will update all of the files that can be
formatted and not re-format them until they change.
It makes sense to add a scope for the `fileTreeView` key where it is
available. At the moment, there is only one `fileTreeView`
implementation but, if that changes down the road, these tasks will
automatically inherit the correct view.
It makes sense for the new glob/nio based apis that we provide first
class support for filtering the results. Because it isn't possible to
scope a task within a task within a task, i.e.
`compile / fileInputs / includePathFilter`, I had to add four new
filter settings of type `PathFilter`:
```
fileInputIncludeFilter :== AllPassFilter.toNio
fileInputExcludeFilter :== DirectoryFilter.toNio || HiddenFileFilter
fileOutputIncludeFilter :== AllPassFilter.toNio
fileOutputExcludeFilter :== NothingFilter.toNio
```
Before, I was effectively hard-coding the filter `RegularFileFilter &&
!HiddenFileFilter` in the inputFileStamps and allInputFiles tasks. These
remain the defaults, as seen in the fileInputExcludeFilter definition
above, but can be overridden by the user.
It makes sense to exclude directories and hidden files for the input
files, but it doesn't necessarily make sense to apply any output filters
by default. For symmetry, it makes sense to have them, but they are
unlikely to be used often.
Apart from adding and defining the default values for these keys, the
only other changes I had to make was to remove the hard-coded filters
from the allInputFiles and inputFileStamps tasks and also add the
filtering to the allOutputFiles task. Because we don't automatically
calculate the FileAttributes for the output files, I added logic for
bypassing the path filter application if the PathFilter is effectively
AllPass, which is the case for the default values because:
```
AllPassFilter.toNio == AllPass
NothingFilter.toNio == NoPass
AllPass && !NoPass == AllPass && AllPass == AllPass
```
Prior to this commit, change tracking in sbt 1.3.0 was done via the
changed(Input|Output)Files tasks, which returned Option[ChangedFiles].
The ChangedFiles case class was defined in io as
```
case class ChangedFiles(created: Seq[Path], deleted: Seq[Path], updated: Seq[Path])
```
When no changes were found, or if there were no previous stamps, the
changed(Input|Output)Files tasks returned None. This made it impossible
to tell whether nothing had changed or if it was the first time.
Moreover, the api was awkward as it required pattern matching or folding
the result into a default value.
To address these limitations, I introduce the FileChanges class. It can
be generated regardless of whether or not previous file stamps were
available. The FileChanges value contains all of the created, deleted,
modified and unmodified files so that the user can directly access these
without having to pattern match.
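A minimal sketch of consuming the new class in a custom task (changedInputFiles stands in for however the FileChanges value is obtained; myTask and process are placeholders):
```
myTask := {
  val changes = changedInputFiles.value // hypothetical accessor returning FileChanges
  // the four file lists are always present; there is no Option to unwrap
  if (changes.created.nonEmpty || changes.modified.nonEmpty || changes.deleted.nonEmpty)
    process(changes.created ++ changes.modified)
}
```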
It is tedious to write `(foo / allInputFiles).value`, so I added simple
extension method macros that expand `foo.inputFiles` to
`(foo / allInputFiles).value` and `foo.outputFiles` to
`(foo / allOutputFiles).value`.
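So a task body can now read (assuming processFiles is a task with fileInputs defined):
```
processFiles := {
  // the macro below expands to (processFiles / allInputFiles).value
  val inputs = processFiles.inputFiles
  inputs.foreach(println)
}
```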
While writing documentation for the new file management/incremental
task evaluation features, I realized that incremental task evaluation
did not have the correct semantics. The problem was that calls to
`.previous` are not scoped within the current task. By this, I mean that
say there are tasks foo and bar and that the definition of bar looks like
```
bar := {
  val current = foo.value
  foo.previous match {
    case Some(v) if v == current => // value hasn't changed
    case _ => process(current)
  }
}
```
The problem is that foo.previous is stored in
effectively (foo / streams).value.cacheDirectory / "previous". This
means that it is completely decoupled from foo. Now, suppose that the
user runs something like:
```
> set foo := 1
> bar // processes the value 1
> set foo := 2
> foo
> bar // does not process the new value 2 because foo was called, which updates the previous value
```
This is not an unrealistic scenario and is, in fact, common if the
incremental task evaluation is chained across multiple processing steps.
For example, in the make-clone scripted test, the linkLib task processes
the outputs of the compileLib task. If compileLib is invoked separately
from linkLib, then when we next call linkLib, it might not do anything
even if there was recompilation of objects because the objects hadn't
changed since the last time we called compileLib.
To fix this, I generalized the previous cache so that it can be keyed on
two tasks, one is the task whose value is being stored (foo in the
example above) and the other is the task in which the stored task value
is retrieved (bar in the example above). When the two tasks are the
same, the behavior is the same as before.
Currently the previous value for foo might be stored somewhere like:
```
base_directory/target/streams/_global/_global/foo/previous
```
Now, if foo is stored with respect to bar, it might be stored in:
```
base_directory/target/streams/_global/_global/bar/previous-dependencies/_global/_global/foo/previous
```
By storing the files this way, it is easy to remove all of the previous
values for the dependencies of a task.
In addition to changing how the files are stored on disk, we have to store
the references in memory differently. A given task can now have multiple
previous references (if, say, two tasks bar and baz both depend on the
previous value). When we complete the results, we first have to collect
all of the successful tasks. Then for each successful task, we find all
of its references. For each of the references, we only complete the
value if the scope in which the task value is used is successful.
In the actual implementation in Previous.scala, there are a number of
places where we have to cast to ScopedKey[Task[Any]]. This is due to
limitations of ScopedKey and Task being type invariant. These casts are
all safe because we never try to get the value of anything, we only use
the portion of the apis of these types that are independent of the value
type. Structural typing where ScopedKey[Task[_]] gets inferred to
ScopedKey[Task[x]] forSome { type x } is a big part of why we have
problems with type inference.
Using List instead of Vector makes the code a bit more readable. We
don't need indexed access into the data structure, so it's unlikely that
Vector was providing any performance benefit.
As part of a documentation push, I noticed that these were undocumented
and that there were some public apis in FileStamp that I intended to be
private[sbt].
I realized it was probably not ideal to have these implicit JsonFormats
defined directly in the FileStamp object because they might
inadvertently be brought into scope with a wildcard import.
I noticed that sometimes if I changed a build source and then ran reload
in the shell, I'd still see a warning about build sources having
changed. We can eliminate this behavior by resetting the
hasCheckedMetaBuild state attribute to false and skipping the
checkBuildSources step if the current command is 'reload'. We also now
skip checking the build source step if the command is exit or reboot.
Zinc records all of the compile source file hashes when compilation
completes. This is problematic because it's possible that a source file
was changed during compilation. From the user perspective, this may mean
that their source change will not be recompiled even if a build is
triggered by the change.
To overcome this, I add logic in the sbt provided external hooks to
override the zinc analysis stamps. This is done by writing the source
file stamps to the previous cache after compilation completes. This
allows us to see the source differences from sbt's perspective, rather
than zinc's perspective. We then merge the combined differences in the
actual implementation of ExternalHooks. In some cases this may result in
over-compilation but generally over-compilation is preferred to under
compilation. Most of the time, the results should be the same.
The scripted test that I added modifies a file during compilation by
invoking a macro. It then effectively asserts that the file is
recompiled during the next test run by validating the compilation result
in the test. The test fails on the latest develop hash.
Changed resources were causing the dependency layer to be invalidated on
resource changes in turbo mode because the resource layer sat between the
scala library layer and the dependency layer. This commit reworks the layers for the
AllDependencyJars strategy so that the top layer is able to load _all_
of the resources during a test run.
The resource layer was added to address the problem that dependencies
may need to be able to load resources from the project classpath but
wouldn't be able to do so if the dependencies were in a separate layer
from the rest of the classpath. The resource layer was a classloader
that could load any resource on the full classpath but couldn't load any
classes. When I added the resource layer, I was thinking that when
resources changed, the resource class loader needed to be invalidated.
Resources, however, are different from classes in that the same
ClassLoader can find the same resources in a different place because
getResource and getResourceAsStream just return locations but do not
actually do any loading.
Taking advantage of this, I add a proxy classloader for finding
resource locations to ReverseLookupClassLoader. We can reset the
classpath of the resource loader in
ReverseLookupClassLoaderHolder.checkout. This allows us to see the new
versions of the resources without invalidating the dependency layer.
The use of '$' in the path names for streams is a pain because, since
'$' is a special character in the shell, it makes it impossible to
directly copy and paste the paths. If we make this change, some builds
will be left with vestigial directories with $global and $build in them
until they run clean. It also would break any scripts that manually
delete these paths. That doesn't seem like a common use case, but it's
worth mentioning.
The use of the persistent file stamp cache between watch runs didn't
seem to cause any issues, but there was some chance for inconsistency
between the file stamp cache and the file system so it makes sense to
put it behind the turbo flag.
After changing the default, the watch/on-change scripted test started
failing. It turns out that the reason is that the file stamp cache
managed by the watch process was not pre-filled by task evaluation. For
this reason, the first time a source file was modified, it was treated
as a creation regardless of whether or not it actually was.
To fix this, I add logic to pre-fill the watch file stamp cache if we
are _not_ persisting the file stamps between runs.
I ran a before and after with the scala build performance benchmark tool
and setting the watchPersistFileStamps key to true reduced the median
run time by about 200ms in the non-turbo case.
There was a bug where sometimes a source file change would not trigger a
new build if the change occurred during a build. Based on the logs, it
seemed to be because a number of redundant events were generated for the
same path and they triggered the anti-entropy constraint of the file
event monitor.
To fix this, I consolidated a number of observers of the global file
tree repository into a single observer. This way, I am able to ensure
that only one event is generated per file event.
I also reworked the onEvent callback to only stamp the file once. It was
previously stamping the modified source file for all of the aggregated
tasks. In the sbt project running `~compile` meant that we were stamping
a source file 22 times whenever the file changed.
This actually uncovered a zinc issue as well. Zinc computes and
writes the hash of the source file to the analysis file after
compilation has completed. If a source file is modified during
compilation, then the new hash is written to the analysis file even when
the compilation may have been made against the previous version of the
file. Zinc will then refuse to re-compile that file until another change
is made.
I manually verified that in the sbt project if I ran `~compile` before
this change and modified a file during compilation, then no event was
triggered (there was a log message about the event being dropped due to
the anti-entropy constraint though). After this change, if I changed a
file during compilation, it seemed to always trigger, but due to the
zinc bug, it didn't always re-compile.
This was causing an "abstract type pattern T is unchecked since it is
eliminated by erasure" warning. It was unneeded because store.get[T] returns
Option[(T, Long)]. I'm surprised that the compiler complained about
this.
@olegych reported that sbt would silently swallow the 'compile' command
in the multi-command, 'run;compile;reload'. I tracked this down to the
build source check. When the build has
Global / onChangedBuildSource := ReloadOnSourceChanges, the check build
sources command returns a new state with "reload" prefixed. To actually
perform the reload, I returned this modified state with the prefixed
reload command.
There were two problems with this:
1) In the auto-reload case, the current command was not run after the
reload
2) If the multi-command contained reload, the auto-reload check would
have a false positive which triggered the bug in (1)
To fix this, I clear out the remaining commands before I run the check
command. That way, we know that if the remaining commands has a reload,
then it is an auto-reload. We then prefix the state with both the reload
and the current command.
I updated the scripted test for auto-reload to handle multi commands
containing reload.
In this commit, I both restore some sbt 1.2.8 behavior and enhance the
api for setting keyboard shortcuts in watch. I change the default start
message to just show the watch count, the tasks that are being monitored
and, on a new line, the instructions to terminate the watch or show more
options.
Here's what it looks like:
```
[info] 1. Monitoring source files for spark/compile...
[info] Press <enter> to interrupt or '?' for more options.
?
[info] Options:
[info] <enter> : interrupt (exits sbt in batch mode)
[info] <ctrl-d> : interrupt (exits sbt in batch mode)
[info] 'r' : re-run the command
[info] 's' : return to shell
[info] 'q' : quit sbt
[info] '?' : print options
```
I also made it so that the new options can be added (and old options
removed) with the watchInputOptions key. For example, to add an option
to reload the build with the key 'l', you could add
```
ThisBuild / watchInputOptions += Watch.InputOption('l', "reload", Watch.Reload)
```
to your global build.sbt.
After adding that to my global ~/.sbt/1.0/global.sbt file, the output of
'?' became:
```
[info] Options:
[info] <ctrl-d> : interrupt (exits sbt in batch mode)
[info] <enter> : interrupt (exits sbt in batch mode)
[info] '?' : print options
[info] 'l' : reload
[info] 'q' : quit sbt
[info] 'r' : re-run the command
[info] 's' : return to shell
```
At the suggestion of @eed3si9n, instead of specifying the file cache
size in bytes, we now specify it in a formatted string. For example,
instead of specifying 128 megabytes in bytes (134217728), we can specify
it with the string "128M".
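A minimal sketch using the fileCacheSize setting described below (scoping illustrative):
```
Global / fileCacheSize := "128M" // instead of 134217728
```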
It can be quite slow to read and parse a large json file. Often, we are
reading and writing the same file over and over even though it isn't
actually changing. This is particularly noticeable with the
UpdateReport*. To speed this up, I introduce a global cache that can be
used to read values from a CacheStore. When using the cache, I've seen
the time for the update task drop from about 200ms to about 1ms. This
ends up being a 400ms time savings for test because update is called for
both Compile / compile and Test / compile.
The way that this works is that I add a new abstraction
CacheStoreFactoryFactory, which is the most enterprise java thing I've
ever written. We store a CacheStoreFactoryFactory in the sbt State.
When we make Streams for the task, we make the Stream's
cacheStoreFactory field using the CacheStoreFactoryFactory. The
generated CacheStoreFactory may or may not refer to a global cache.
The CacheStoreFactoryFactory may produce CacheStoreFactory instances
that delegate to a Caffeine cache with a max size parameter that is
specified in bytes by the fileCacheSize setting (which can also be set
with -Dsbt.file.cache.size). The size of the cache entry is estimated by
the size of the contents on disk. Since we are generally storing things
in the cache that are serialized as json, I figure that this should be a
reasonable estimate. I set the default max cache size to 128MB, which is
plenty of space for the previous cache entries for most projects. If the
size is set to 0, the CacheStoreFactoryFactory generates a regular
DirectoryStoreFactory.
To ensure that the cache accurately reflects the disk state of the
previous cache (or other caches using a CacheStore), the Caffeine cache
stores the last modified time of the file whose contents it should
represent. If there is a discrepancy in the last modified times (which
would happen if, say, clean has been run), then the value is read from
disk even if the value hasn't changed.
* With the following build.sbt file, it takes roughly 200ms to read and
parse the update report on my computer:
```
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.3"
libraryDependencies += "org.scalatest" %% "scalatest" % "3.0.1"
```
This is because spark-sql has an enormous number of dependencies and the
update report ends up being 3MB.
Fixes #4802
For Ivy integration sbt uses the credentials task in a peculiar way. 9fa25de84c/main/src/main/scala/sbt/Defaults.scala (L2271-L2275)
This lets the build user put the `credentials` task in various places, like the metabuild or root project, but they would all act as if they were scoped globally. This PR adds an `allCredentials` task to emulate the behavior and pass credentials into lm-coursier.
This Resolver had the same name as the typesafe ivy resolver specified
in the launcher boot.properties. It was creating a number of verbose
warnings about having multiple resolvers with the same name. I noticed
that the ivy pattern is slightly different for the boot resolver with
this name. It didn't seem to be causing any problems to have both
resolvers.
Fixes #4839