Fixes https://github.com/sbt/sbt/issues/3112
This unpacks Extracted as State's extension methods.
In addition this provides a way of responding via LSP.
csrResolvers.all evaluates all possible scopes in arbitrary order. This change
makes sure at least the project resolvers are placed before any resolvers from
dependency projects.
When sbt was entered through xMain.run and the classloaders do not have
the expected format, sbt recreates the classloaders for itself.
Unfortunately the extra classpath was not added to the classloader. This
caused project/extra to fail if it was entered from RunFromSourceMain
rather than with the launcher.
The main reason for having both the RunFromSourceMain and LauncherBased
scripted tests was that RunFromSourceMain would fail for any test that
ended up accessing the sbt.Package$ object. This commit fixes this bug
by reworking the classloader generated by RunFromSourceMain to invoke
sbt, switching from the classpath to jar classpath (by setting exportJars =
true) and entering sbt by calling `new xMain().run` rather than
`xMain.run`.
The reason for switching to the jar classpath is that the jvm seems to
have issues when there are two classes provided in different directories
that have the same case insensitive name, e.g. `sbt.package$` and
`sbt.Package$`. If those classes are instead provided in different
jars, the jvm seems to be able to handle it.
Exporting the jars is not enough though, I had to rework the
ClassLoader created in the launch method to have a layout that was
recognized by xMainConfiguration. I reimplemented the AppConfiguration
in java so that it could bootstrap itself in a single jar classloader
(the only needed jar is the Scripted.
If we export the jars in the build, then the NoClassDefErrors for
`sbt.Package$` go away during scripted tests using RunSourceFromMain.
This might make running tests in subprojects slightly slower but I think
its a worthy tradeoff.
In order for the sbt launcher to be able to resolve a local version of
sbt, we must publish the main jar, the sources jar, the doc jar, the pom
and an ivy.xml file. The publish and publishLocal tasks are wired in
IvyXml.scala to create an ivy.xml file before running publish. This
wasn't done with publishLocalBin which made it not work when no ivy.xml
file was already present (which was the case after running clean).
either create test reports with legacy file names (legacyreport=true) or with standard file names (legacyreport=false or omitted) but not both as suggested in #4451
Fixes https://github.com/sbt/sbt/issues/5339
It seems like some tests are using `ClassLoader#getResource("")` to acquire the `classes` directory path. This does not seem to work on sbt 1.3.6, which returns `file:/home/travis/.cache/coursier/v1/https/repo1.maven.org/maven2/org/apache/logging/log4j/log4j-api/2.11.2/log4j-api-2.11.2.jar!/META-INF/versions/9/`. To workaround this issue, I've switched to loading the known folder name instead.
In 53788ba356, I changed the cross multi
parser to issue all of the commands sequentially. This caused a
performance regression for many use cases:
https://github.com/sbt/sbt/issues/5321. This commit restores the old
behavior of `+` if the command to run has no arguments.
Ref https://github.com/sbt/sbt/issues/5314
Ref https://github.com/sbt/sbt/pull/5265
In sbt 1.3.4, it's possible to define a subproject named `client`.
The current parser behaves differently whether we calll `client/clean` or `client / clean` with whitespaces. The one with the whitespace invokes `client` command (as in thin client). This gets triggered by `+clean` because the new implementation uses whitespace.
There have been a number of issues that have come up because of sbt
1.3.0 aggressively closing classloaders. While these issues have been
quite useful in helping us determine some issues related to classloader
lifecycle, we should give users the option to prevent sbt from closing
the classloaders.
I also noticed that the classloader-cache/spark test has been
occasionally segfaulting on travis so I disable classloader closing in
that test.
Fixes https://github.com/sbt/sbt/issues/5063
This fixes "sbt new" on Ubuntu by restoring the terminal state after supershell querying for the terminal width.
sbt should not by default create files in the location specified by
java.io.tmpdir (which is the default behavior of apis like
IO.createTemporaryDirectory or Files.createTempFile) because they have a
tendency to leak and it also isn't even guaranteed that the user has
write permissions there (though this is unlikely). Doing so creates the
possibility for leaks
I git grepped for `createTemp` and found these apis. After this change,
the files created by sbt should largely be localized to the project and
sbt global base directories.
Closing the ManagedClassLoader generated by test can cause nonlocal
effects because the jdk shares some JarFile resources across multiple
URLClassLoaders. As a result, if one classloader is trying to load a
resource and the classloader is closed, it might cause the resource
loading to fail (see https://github.com/sbt/sbt/issues/5262). This can
be fixed by moving the scalatest framework jar (and its dependencies)
into an additional classloader layer that sits between the scala library
loader and the rest of the test dependencies.
In addition to adding the new layer, I reworked the
ReverseLookupClassLoader to use its dependent classloader to find
resources that may below it in the classloading hierarchy rather than
constructing an entirely new classloader for resources.
After this change, I was able to run test in the repro project:
https://github.com/rjmac/sbt-5262 1000 times with no failures. Note that
the repro is sensitive to the jdk used. I could not reproduce with jdk11
but I could typically induce a failure within 20 or so runs with jdk8.
I benchmarked this change with
https://github.com/eatkins/scala-build-watch-performance and performance
was roughly the same as 1.3.4 with turbo mode and about 200-250ms faster
in non-turbo mode (which can be explained by the time to load the
scalatest classes).
This makes it possible to do mkIvyConfiguration.value.withXXX(...) for
all the methods in InlineIvyConfiguration. (I need this to remove
inter-project resolvers when fetching dotty from sbt-dotty to avoid
accidentally fetching a local project in the build of dotty itself).
The previous implementation of ZombieClassLoader was not thread safe.
This caused problems because it is possible for the ManagedClassLoader
in test to leak into the coursier thread pool if the test uses bouncy
castle apis. Unfortunately, these apis seem to in some cases assign
static variables using the Thread context class loader. Because the
bouncycastle apis are implemented by the jdk, they are found in the
system classloader and thus the static references leak out of the test
context.
I had a local repro of https://github.com/sbt/sbt/issues/5249 that is
fixed by this change.
I found it hard to reason about where certain local variables, like
currentRef, were coming from. I also changed 'x' to 'extracted' in a few
places for clarity as well.
Rather than putting the background job temporary files in whatever
java.io.tmpdir points to, this commit moves the files into a
subdirectory of target in the project root directory.
To make the directory configurable via settings, I had to move the
declaration of the bgJobService setting later in the project
initialization process. I don't think this should matter because
background jobs shouldn't be created until after the project has loaded
all of its settings..
When a user calls sbt exit and there is an active background job, sbt
may not exit cleanly. This was primarily because the
background job service shutdown method depended on the
StandardMain.executionContext which was closed before the background job
service was shutdown. This was fixable by reordering the resource
closing in StandardMain.runManaged.
When running a main method, if the user inputs ctrl+c then the `run`
task will exit but the main method is not interrupted so it continues
running even once sbt has returned to the shell. If the main method is a
webserver, this prevents run from ever starting again on a fixed port.
To fix this, we can modify the waitForTry method to stop the job if an
exception is thrown (ctrl+c leads to an interrupted exception being
thrown by waitFor).
I rework the BackgroundJobService so that the default implementation of
waitForTry is now usable and no longer needs to be overridden. The side
effect of this change is that waitFor may now throw an exception. Within
sbt, waitFor was only used in one place and I reworked it to use
waitForTry instead. This could theoretically break a downstream user who
relied on waitFor not throwing an exception but I suspect that there
aren't many users of this api, if any at all.
The StateTransform class introduced in
9cdeb7120e did not cleanly integrate with
logic for transforming the state using the `transformState` task
attribute. For one thing, the state transform was only applied if the
root node returned StateTransform but no transformation would occur if a
task had a dependency that returned StateTransform. Moreover, the
transformation would override all other transformations that may have
occurred during task evaluation.
This commit updates StateTransform to act very similarly to the
transformState attribute. Instead of wrapping a `State` instance, it now
wraps a transformation function from `State => State`. This function
can be applied on top of or prior to the other transformations via the
`transformState`.
For binary compatibility with 1.3.0, I had to add the stateProxy
function as a constructor parameter in order to implement the `state`
method. The proxy function will generally throw an exception unless the
legacy api is used. To avoid that, I private[sbt]'d the legacy api so
that binary compatibility is preserved but any builds targeting > 1.4.x
will be required to use the new api.
Unfortunately I couldn't private[sbt] StateTransform.apply(state: State)
because mima interpreted it as a method type change becuase I added
StateTransform.apply(transform: State => State). This may be a mima bug
given that StateTransform.apply(state: State) would be jvm public even
when private[sbt], but I figured it was quite unlikely that any users
were using this method anyway since it was incorrectly implemented in
1.3.0 to return `state` instead of `new StateTransform(state)`.
The linting can take a while for large projects because `Def.compiled`
scales in the number of settings. Even for small projects (i.e. scripted
tests), it takes about 50 ms on my computer. This doesn't change the
current behavior because the default value is true.
This PR includes the values of the `description` and `homepage`
settings into the `ivy.xml` files generated by the `makeIvyXml`
task. It restores the behaviour of sbt 1.2.8 and if `useCoursier`
is set to `false`.
Two things are changed in this PR:
* `IvyXml.content` now adds the `homepage` attribute to the
`description` element if `project.info.homePage` is not empty.
* `CoursierInputsTasks.coursierProject0` now fills the previous
empty `CProject.info` field with the description and homepage.
Closes: #5234
Ref #4211Fixes#4395Fixes#4600
This is a reimplementation of `--addPluginSbtFile`. #4211 implemented the command to load extra `*.sbt` files as part of the global plugin subproject. That had the unwanted side effects of not working when `.sbt/1.0/plugins` directory does not exist. This changes the strategy to load the `*.sbt` files as part of the meta build.
```
$ sbt -Dsbt.global.base=/tmp/hello/global --addPluginSbtFile=/tmp/plugins/plugin.sbt
[info] Loading settings for project hello-build from plugin.sbt ...
[info] Loading project definition from /private/tmp/hello/project
sbt:hello> plugins
In file:/private/tmp/hello/
sbt.plugins.IvyPlugin: enabled in root
sbt.plugins.JvmPlugin: enabled in root
sbt.plugins.CorePlugin: enabled in root
sbt.ScriptedPlugin
sbt.plugins.SbtPlugin
sbt.plugins.SemanticdbPlugin: enabled in root
sbt.plugins.JUnitXmlReportPlugin: enabled in root
sbt.plugins.Giter8TemplatePlugin: enabled in root
sbtvimquit.VimquitPlugin: enabled in root
```
The current injection of the new nio keys will overwrite any definitions
of those keys in a build source. This is undesirable. The fix is to
create a mapping of scoped keys to settings and for each inject setting
key, if there is a previous key, put that definition after the injected
definition so that it can override it.
It is still possible for progress threads to leak so shut them down if
there are no active tasks. The report0 method will start up a new thread
if a task is added.
In some circumstances, sbt would generate a number of task progress
threads that could run concurrently. The issue was that the TaskProgress
could be shared by multiple EvaluateTaskConfigs if a dynamic task was
used. This was problematic because when a dynamic task completed, it
might call afterAllCompleted which would stop the progress thread. There
also was a race condition because multiple threads calling initial could
theoretically have created a new progress thread which would cause a
resource leak.
To fix this, we modify the shared task progress so that the `stop()`
method is a no-op. This should prevent dynamic tasks from stopping the
progress thread. We also defer the creation of the task thread until
there is at least one active task. This prevents a thread from being
created in the shell.
The motivation for this change was that I found that sometimes there was
a leaked progress thread that would make the shell not really work for
me because the progress thread would overwrite the shell prompt. This
change fixes that behavior and I was able to validate with jstack that
there was consistently either one or zero task progress threads at a
time (zero in the shell, one when tasks were actually running).
When running sbt -Dtask.timings=true, the task timings get printed to
the console which can overwrite the shell prompt. When we use a logger,
the timing lines are correctly separated from the prompt lines.
The completions were generating page numbers that didn't make sense if
there were a small number of scripted tests. For example, suppose that
there were only two tests defined, it would generate *1of3 *2of3 and
*3of3 completions even though there weren't even three tests.
The way clean was implemented, it was running `clean`, `ivyModule` and
`streams` concurrently. This was problematic because clean could blow
away files needed by `ivyModule` and `streams`. To fix this, move the
cleanCachedResolutionCache into a separate task and run that before the
normal clean.
Should fix https://github.com/sbt/sbt/issues/5067.
Fixes https://github.com/sbt/sbt/issues/3183
This implements an input task lintBuild that checks for unused settings/tasks.
Because most settings are on the intermediary to other settings/tasks, they are included into the linting by default. The notable exceptions are settings used exclusively by a command. To opt-out, you can either append it to `Global / excludeLintKeys` or set the rank to invisible.
On the other hand, many tasks are on the leaf (called by human), so task keys are excluded from linting by default. However there are notable tasks that trip up users, so they are opted-in using `Global / includeLintKeys`.
I noticed that when entering the console, I'd often be left with a
supershell line at the bottom of the screen that would eventually get
interlaced with my console commands. This can be eliminated by clearing
the supershell progress before evaluating the task if it is one of the
skip tasks.
Fixes https://github.com/sbt/sbt/issues/1673
There's been report of intermittent "Could not create directory" error related to "classes.bak." retronym identified that all configurations are using the same directory, and that might be the cause of race condition.
This addresses the issue by assigning a unique directory for each configuration.
These classloaders which are created if sbt is launched with a legacy
launcher (or one that doesn't follow the current classloading hierarchy
convention), were implemented in scala, but that meant that they were
not parallel capable. I fix that by moving the implementations to java.
I also move the static method that creates a MetaBuildLoader into the
java class.
A number of users were reporting issues with deadlocking when using
1.3.2: https://github.com/sbt/sbt/issues/5116. This seems to be because
most of the sbt created classloaders were not actually parallel capable.
In order for a classloader to be registered as a parallel capable, ALL
of the parent classes except for object in the class hierarchy must be
registered as a parallel capable:
https://docs.oracle.com/javase/8/docs/api/java/lang/ClassLoader.html#registerAsParallelCapable--.
If a classloader is not registered as parallel capable, then a global
lock will be used internally for classloading and this can lead to deadlock.
It is impossible to register a scala 2 classloader as parallel capable
so I ported all of the classloaders to java.
This commit updates the java-serialization scripted test. Prior to the
port, the new version of the test would more or less always deadlock.
After this change, I haven't been able to reproduce a deadlock.
This had no significant performance impact when I reran
https://github.com/eatkins/scala-build-watch-performance
Fixes#1458
Running sbt from `/` results to sbt getting stuck trying to load the directories recursively, and eventually erroring with a java.lang.OutOfMemoryError (after freezing for a long time) even on an Alpine container.
To prevent it, this adds a check to see if the absolute path is `/` or not.
```
/ $ sbt -Dsbt.version=1.4.0-SNAPSHOT
[error] java.lang.IllegalStateException: cannot run sbt from root directory without -Dsbt.rootdir=true; see sbt/sbt#1458
[error] Use 'last' for the full log.
```
The message:
```
Project loading failed: (r)etry, (q)uit, (l)ast, or (i)gnore?
```
is not explicit about retry being the option used when pressing return.
I don't think that dummy tasks really make sense for task progress
because they are evaluated outside of the normal task evaluation. This
came up because I was seeing streams-manager in supershell which didn't
seem useful.
I noticed some flickering in super shell progress lines and realized
that it was because there were multiple progress threads running
concurrently. This is problematic because each thread has a completely
different state so if each thread has an active task, the display will
flicker between the two tasks. I think this is caused primarily by
dynamic tasks. At least the example where I was seeing it was caused by
a dynamic task.
If a project had a meta-meta build (project/project), the build sources
in the project directory were ignored. This was because the projectGlobs
method did not correctly handle recursion. It inadvertently
discarded the accumulator globs and only returned the most recently
generated globs. This commit fixes that and adds a regression test to
the nio/reload scripted test.
I was looking into sbt start up time and in profiling was able to
identify a number of classloading bottlenecks. To speed up
initialization, we can preload those classes in the background. I saw
average speedups of roughly .75 seconds after this change. Also, the `time`
command would consistently report cpu system time very close to 400% and
I have 4 cores on my laptop. With 1.3.0 it would be more like 350%.
There have been a number of complaints about the new classloader closing
behavior. It is too aggressive about closing classloaders after test and
run. This commit softens the behavior by allowing a classloader to be
resurrected after close by creating a new zombie classloader that has
the same urls as the original classloader. After this commit, we always
close the classloaders when we are done, but they can still leak
file descriptors if a zombie is created.
To configure the behavior, I add the allowZombieClassLoaders key. If it
is false (which is default), we will warn but still allow them. If it
is true, then we silence the warning. In a later version of sbt, we can
change the semantics to be strict.
I verified after this change that I could add a shutdown hook in `run`
and it would be evaluated so long as I set `bgCopyClasspath := false`.
Otherwise the needed jars were deleted before the hooks could run.
Bonus: delete unused ResourceLoaderImpl class
Fixes#5070
This adds a new setting called `includePluginResolvers` (default `false`).
When set to `true`, it the project will include resolvers from the metabuild.
This allows the build user to declare a resolver in one place (`project/plugins.sbt`) that gets applied to both the metabuild as well as all the subprojects. The scenario comes up when someone distributes a software on their own repo. Ref #4103
We want to recursively monitor the project meta build, but we also want
to avoid listing directories that don't exist. To compromise, I rework
the buildSourceFileInputs to add the nested project directories if they
exist. Because the fileInputs are a setting, this means that adding a
new project directory and *.sbt or *.scala will not immediately trigger
a rebuild, but in most common cases, it will. I added a scripted test
for this.
In sbt 1.3.0, we only monitor build sources in the root project
directory and the root project meta build directory. This commit adds
these inputs for each project.
Fixes https://github.com/sbt/sbt/issues/5061.
This was an oversight that caused consoleQuick to not work with
supershell. We should probably try to figure out a way to allow custom
tasks to black list themselves from super shell reporting.
In https://github.com/sbt/sbt/issues/5075 we realized that sbt 1.3.0
introduces a regression where it closes the classloader used to invoke
the main method for in process run before all of its non-daemon threads
have terminated. To fix this and still close the classloader, I add a
method, runInBackgroundWithLoader that provides the background job
services with an optional classloader that it can close after the job
completes.
This cleanly merges and works with 1.3.x as well.
Fixes https://github.com/sbt/sbt/issues/5047
When setting swoval.tmpdir via globalBase, changed to set globalBase as absolute path.
`com.swoval.runtime.NativeLoader.loadPackaged` uses `java.lang.System.load`.
It requires absolute path, so we should set `swoval.tmpdir` with absolute path.
During refactoring, these warnings got out of date. I also added
scaladoc to the watchTriggeredMessage key.
Ref: https://github.com/sbt/sbt/issues/5051.
To avoid reliance on jvm global variables, we need to share the super
shell state with each of the console appenders that write to the console
out. We only set the progress state for the console appenders for the
screen. This prevents messages that are below the global logging level
from modifying the progress state without preventing them from being
written to other appenders.
The ability to set the ProgressState for each of the console appenders
is added in a companion util PR.
I verified that the test output of io/test was correctly written to the
streams after this change (there were no progress lines in the output).
Disabling supershell when color mode is disabled is a sensible default
(especially for piped output). However, I think it should still be
possible to use supershell in no color mode.
This requires a util change that also enables supershell in no color
mode.
It is redundant and slow to restamp all of the dependency classpath
files when they have likely already been stamped by a subproject.
For the classfiles of subprojects, we fill the managedFileStampCache
with the values returned by the zinc compile analysis product stamps.
This is why they are probably already in the managed cache and should be
up to date so long as zinc is working correctly.
I noticed that various outputFileStamps tasks were showing up in the
task timing report when I ran Test / definedTests in the main sbt project.
That task became about 400ms faster after this change.
I noticed that the reports generated when using sbt.task.timings=true
made very little sense. They were displaying timings for tests that
couldn't possibly have been run. I tracked this down to the TaskTimings
be stored in the progressReport setting which meant they were reused
across multiple task runs. After this change, the reports made a lot
more sense.
The tab completions for scripted have long been broken. They display a
number of non-sensical pages like '*0of9' or '*1of0'. Some of the
multiparser changes seem to have caused these invalid
In 5eab9df0df, I updated the
outputFileStamps task to compute all of the stamps for a directory
recursively if an output file is a directory. Prior to that, it had only
computed the stamp for the directory itself. This caused a significant
performance regression in creating the test classloader because it was
computing the last modified time for all of the classfiles in the class path.
The test for 5000 source files in
https://github.com/eatkins/scala-build-watch-performance was running roughly
400ms slower due to this regression.
Ref https://github.com/sbt/sbt/issues/4905
This is a companion PR to https://github.com/sbt/librarymanagement/pull/318.
This will print the following warnings:
```
sbt:hello> compile
[warn] insecure HTTP request is deprecated 'Artifact(jsoup, jar, jar, None, Vector(), Some(http://jsoup.org/packages/jsoup-1.9.1.jar), Map(), None, false)'; switch to HTTPS or opt-in using from(url(...), allowInsecureProtocol = true) on ModuleID or .withAllowInsecureProtocol(true) on Artifact
[warn] insecure HTTP request is deprecated 'http://repo.typesafe.com/typesafe/releases/'; switch to HTTPS or opt-in as ("Typesafe Releases" at "http://repo.typesafe.com/typesafe/releases/").withAllowInsecureProtocol(true)
[warn] insecure HTTP request is deprecated 'http://repo.typesafe.com/typesafe/releases/'; switch to HTTPS or opt-in as ("Typesafe Releases" at "http://repo.typesafe.com/typesafe/releases/").withAllowInsecureProtocol(true)
[warn] insecure HTTP request is deprecated 'http://repo.typesafe.com/typesafe/releases/'; switch to HTTPS or opt-in as ("Typesafe Releases" at "http://repo.typesafe.com/typesafe/releases/").withAllowInsecureProtocol(true)
[warn] insecure HTTP request is deprecated 'Patterns(ivyPatterns=Vector(), artifactPatterns=Vector(http://repo.typesafe.com/typesafe/releases/[organisation]/[module](_[scalaVersion])(_[sbtVersion])/[revision]/[artifact]-[revision](-[classifier]).[ext]), isMavenCompatible=true, descriptorOptional=false, skipConsistencyCheck=false)'; switch to HTTPS or opt-in as Resolver.url("Typesafe Ivy Releases", url(...)).withAllowInsecureProtocol(true)
[warn] insecure HTTP request is deprecated 'Patterns(ivyPatterns=Vector(), artifactPatterns=Vector(http://repo.typesafe.com/typesafe/releases/[organisation]/[module](_[scalaVersion])(_[sbtVersion])/[revision]/[artifact]-[revision](-[classifier]).[ext]), isMavenCompatible=true, descriptorOptional=false, skipConsistencyCheck=false)'; switch to HTTPS or opt-in as Resolver.url("Typesafe Ivy Releases", url(...)).withAllowInsecureProtocol(true)
[warn] insecure HTTP request is deprecated 'Patterns(ivyPatterns=Vector(), artifactPatterns=Vector(http://repo.typesafe.com/typesafe/releases/[organisation]/[module](_[scalaVersion])(_[sbtVersion])/[revision]/[artifact]-[revision](-[classifier]).[ext]), isMavenCompatible=true, descriptorOptional=false, skipConsistencyCheck=false)'; switch to HTTPS or opt-in as Resolver.url("Typesafe Ivy Releases", url(...)).withAllowInsecureProtocol(true)
```
I inadvertently changed the semantics of clean so that cleanFiles would
only delete the file if it was a regular file. In older versions of sbt,
if a file in cleanFiles was a directory, it would be recursively
deleted.
During refactoring of Continuous, I inadvertently changed the semantics
of `~` so that all multi commands were run regardless of whether or not
an earlier command had failed. I fixed the issue and added a regression
test.
It was reported in https://github.com/sbt/sbt/issues/4973 that the
scalaVersion setting was not being correctly set in a script running
with ScriptMain using 1.3.0-RC4. Using git bisect, I found that the
issue was introduced in
73cfd7c8bd.
That commit manipulates the classloaders passed in by the launcher, but
only for the xMain entry point. I found that the script ran correctly if
I updated the classloader for ScriptedMain as well.
After these changes, the example script in #4973 correctly prints 2.13.0
for the scala version with a locally published sbt.
Bonus: rename xMainImpl object xMain. It was private[sbt] anyway.
https://github.com/sbt/sbt/issues/4986 reported that +compile would
always recompile everything in the project even when the sources hadn't
changed. This was because the dependency classpath was changing between
calls to compile, which caused the external hooks cache introduced in
32a6d0d5d7 to invalidate the scala
library. To fix this, I cache the file stamps on a per scala version
basis. I added a scripted test that checks that there is no
recompilation in two consecutive calls to `+compile` in a multi scala
version build. It failed prior to these changes.
I looked for serial bottlenecks in sbt project loading and discovered
that KeyIndex.aggregate was relatively easily parallelizable. Before
this time, it took about 1 second to run KeyIndex.aggregate in the akka
project on my computer. After this change, it took 250ms. Given that I
have 4 logical cores, the speedup is roughly linear.
It turns out that injecting the keys necessary for incremental tasks
causes a significant startup penalty for many larger projects. For
example, akka starts up about 3 seconds faster if do not inject these
settings for the tasks returning `File` or `Seq[File]`. Given that all
of these apis use java.nio.file anyway, it makes sense to not backport
them to older tasks.
The clean task got a lot slower in 1.3.0
(https://github.com/sbt/sbt/issues/4972). The reason for this was that
sbt 1.3.0 generated many custom clean tasks for any tasks that returned
`File` or `Seq[File]`. Each of these tasks was tagged with
Tags.Clean which meant that only one of them could run at a time. As a
result, it took a long time to evaluate all of the custom tasks, even if
they were no-ops. In the akka project, a no-op clean was taking 35
seconds which is simply unacceptable. After this change, a no-op clean
takes less than a second in akka (a full clean only takes about 6
seconds after running test:compile)
To fix this, I stopped aggregating the clean task across configs and
projects. Because I removed the aggregation, I needed to manually
implement clean in the `Compile` and `Test` configurations to make
`Compile / clean` and `Test / compile` clean work correctly.
Fixes#4964
Together with https://github.com/sbt/util/pull/211, this brings back stack trace supression for custom tasks by default.
Debug levels logs are available in `last`, and this prints a message informing the user of the fact. BLUE on dark background is difficult to read, so I am chaning the color hilight to MAGENTA.
After adding the automatic lookup to external hooks for missing binary
jars, the scripted test dependency-management/invalidate-internal
started failing. This was because the previous analysis contained a jar
dependency that still existed on disk but was no longer a part of the
dependency classpath. Fundamentally the problem is that the zinc
compile analysis is not tightly coupled with the sbt build state.
To fix this, we can cache the dependency classpath file stamps in the
same way that we cache the input file stamps in external hooks and
manually diff them at the sbt level. We then force updates regardless of
the difference between the zinc state and the sbt state.
It was reported that in community builds, sometimes there was
spurious over-compilation due to invalidation of the scala library jar
(https://github.com/sbt/sbt/issues/4948). The reason for this was that
the external hooks prefills the managed cache with all of the time
stamps for the project dependencies but was not looking up any jars that
weren't in the cache. I suspect I did this because I didn't realize that
zinc also includes its own classpath in the binaries which is not
a part of the dependencyClasspath. The fix is to just add the jar to the
cache if it doesn't already exist by switching to getOrElseUpdate from
get.
I followed the steps in #4948 and published a version of sbt locally
with this change and the spurious re-builds stopped.
In the code formatting use case, the formatting task may modify the
source files in place. If the formatting task uses the nio
inputFileStamps, then it would fill the in-memory cache of source paths
to file stamps. This would cause compile to see the pre-formatted
stamps. To fix this, we can invalidate the file cache entries for the
outputs of a task. This will cause the side-effect of some extra io
because the hashes may be computed three times: once for the format
inputs, once for the format outputs and once for the compile inputs. I
think most users would understand that adding auto-formatting would
potentially slowdown compilation.
To really prove this out, I implemented a poor man's scalafmt plugin in
a scripted test. It is fully incremental. Even in the case when some
files cannot be formatted it will update all of the files that can be
formatted and not re-format them until they change.
It makes sense to add a scope for the `fileTreeView` key where it is
available. At the moment, there is only one `fileTreeView`
implementation but, if that changes down the road, these tasks will
automatically inherit the correct view.
It makes sense for the new glob/nio based apis that we provide first
class support for filtering the results. Because it isn't possible to
scope a task within a task within a task, i.e.
`compile / fileInputs / includePathFilter`, I had to add four new
filter settings of type `PathFilter`:
fileInputIncludeFilter :== AllPassFilter.toNio,
fileInputExcludeFilter :== DirectoryFilter.toNio || HiddenFileFilter,
fileOutputIncludeFilter :== AllPassFilter.toNio,
fileOutputExcludeFilter :== NothingFilter.toNio,
Before I was effectively hard-coding the filter: RegularFileFilter &&
!HiddenFileFilter in the inputFileStamps and allInputFiles tasks. These
remain the defaults, as seen in the fileInputExcludeFilter definition
above, but can be overridden by the user.
It makes sense to exclude directories and hidden files for the input
files, but it doesn't necessarily make sense to apply any output filters
by default. For symmetry, it makes sense to have them, but they are
unlikely to be used often.
Apart from adding and defining the default values for these keys, the
only other changes I had to make was to remove the hard-coded filters
from the allInputFiles and inputFileStamps tasks and also add the
filtering to the allOutputFiles task. Because we don't automatically
calculate the FileAttributes for the output files, I added logic for
bypassing the path filter application if the PathFilter is effectively
AllPass, which is the case for the default values because:
AllPassFilter.toNio == AllPass
NothingFilter.toNio == NoPass
AllPass && !NoPass == AllPass && AllPass == AllPass
Prior to this commit, change tracking in sbt 1.3.0 was done via the
changed(Input|Output)Files tasks which were tasks returning
Option[ChangedFiles]. The ChangedFiles case class was defined in io as
case class ChangedFiles(created: Seq[Path], deleted: Seq[Path], updated: Seq[Path])
When no changes were found, or if there were no previous stamps, the
changed(Input|Output)Files tasks returned None. This made it impossible
to tell whether nothing had changed or if it was the first time.
Moreover, the api was awkward as it required pattern matching or folding
the result into a default value.
To address these limitations, I introduce the FileChanges class. It can
be generated regardless of whether or not previous file stamps were
available. The changes contains all of the created, deleted, modified
and unmodified files so that the user can directly call these methods
without having to pattern match.
It is tedious to write (foo / allInputFiles).value so I added simple
extension method macros that expand `foo.inputFiles` to
(foo / allInputFiles).value and `foo.outputFiles` to
`(foo / allOutputFiles).value`.
While writing documentation for the new file management/incremental
task evaluation features, I realized that incremental task evaluation
did not have the correct semantics. The problem was that calls to
`.previous` are not scoped within the current task. By this, I mean that
say there are tasks foo and bar and that the defintion of bar looks like
bar := {
val current = foo.value
foo.previous match {
case Some(v) if v == current => // value hasn't changed
case _ => process(current)
}
}
The problem is that foo.previous is stored in
effectively (foo / streams).value.cacheDirectory / "previous". This
means that it is completely decoupled from foo. Now, suppose that the
user runs something like:
> set foo := 1
> bar // processes the value 1
> set foo := 2
> foo
> bar // does not process the new value 2 because foo was called, which updates the previous value
This is not an unrealistic scenario and is, in fact, common if the
incremental task evaluation is changed across multiple processing steps.
For example, in the make-clone scripted test, the linkLib task processes
the outputs of the compileLib task. If compileLib is invoked separately
from linkLib, then when we next call linkLib, it might not do anything
even if there was recompilation of objects because the objects hadn't
changed since the last time we called compileLib.
To fix this, I generalizedthe previous cache so that it can be keyed on
two tasks, one is the task whose value is being stored (foo in the
example above) and the other is the task in which the stored task value
is retrieved (bar in the example above). When the two tasks are the
same, the behavior is the same as before.
Currently the previous value for foo might be stored somewhere like:
base_directory/target/streams/_global/_global/foo/previous
Now, if foo is stored with respect to bar, it might be stored in
base_directory/target/streams/_global/_global/bar/previous-dependencies/_global/_gloal/foo/previous
By storing the files this way, it is easy to remove all of the previous
values for the dependencies of a task.
In addition to changing how the files are stored on disk, we have to store
the references in memory differently. A given task can now have multiple
previous references (if, say, two tasks bar and baz both depend on the
previous value). When we complete the results, we first have to collect
all of the successful tasks. Then for each successful task, we find all
of its references. For each of the references, we only complete the
value if the scope in which the task value is used is successful.
In the actual implemenation in Previous.scala, there are a number places
where we have to cast to ScopedKey[Task[Any]]. This is due to
limitations of ScopedKey and Task being type invariant. These casts are
all safe because we never try to get the value of anything, we only use
the portion of the apis of these types that are independent of the value
type. Structural typing where ScopedKey[Task[_]] gets inferred to
ScopedKey[Task[x]] forSome x is a big part of why we have problems with
type inference.
Using List instead of vector makes the code a bit more readable. We
don't need indexed access into the data structure so its unlikely that
Vector was providing any performance benefit.
As part of a documentation push, I noticed that these were undocumented
and that there were some public apis in FileStamp that I intended to be
private[sbt].
I realized it was probably not ideal to have these implicit JsonFormats
defined directly in the FileStamp object because they might
inadvertently be brought into scope with a wildcard import.
I noticed that sometimes if I changed a build source and then ran reload
in the shell, I'd still see a warning about build sources having
changed. We can eliminate this behavior by resetting the
hasCheckedMetaBuild state attribute to false and skipping the
checkBuildSources step if the current command is 'reload'. We also now
skip checking the build source step if the command is exit or reboot.
Zinc records all of the compile source file hashes when compilation
completes. This is problematic because its possible that a source file
was changed during compilation. From the user perspective, this may mean
that their source change will not be recompiled even if a build is
triggered by the change.
To overcome this, I add logic in the sbt provided external hooks to
override the zinc analysis stamps. This is done by writing the source
file stamps to the previous cache after compilation completes. This
allows us to see the source differences from sbt's perspective, rather
than zinc's perspective. We then merge the combined differences in the
actual implementation of ExternalHooks. In some cases this may result in
over-compilation but generally over-compilation is preferred to under
compilation. Most of the time, the results should be the same.
The scripted test that I added modifies a file during compilation by
invoking a macro. It then effectively asserts that the file is
recompiled during the next test run by validating the compilation result
in the test. The test fails on the latest develop hash.
Changed resources were causing the dependency layer to be invalidated on
resource changes in turbo mode because the resource layer was in between
the scala library layer. This commit reworks the layers for the
AllDependencyJars strategy so that the top layer is able to load _all_
of the resources during a test run.
The resource layer was added to address the problem that dependencies
may need to be able to load resources from the project classpath but
wouldn't be able to do so if the dependencies were in a separate layer
from the rest of the classpath. The resource layer was a classloader
that could load any resource on the full classpath but couldn't load any
classes. When I added the resource layer, I was thinking that when
resources changed, the resource class loader needed to be invalidated.
Resources, however, are different from classes in that the same
ClassLoader can find the same resources in a different place because
getResource and getResourceAsStream just return locations but do not
actually do any loading.
Taking advantage of this, I add a proxy classloader for finding
resource locations to ReverseLookupClassLoader. We can reset the
classpath of the resource loader in
ReverseLookupClassLoaderHolder.checkout. This allows us to see the new
versions of the resources without invalidating the dependency layer.
The use of '$' in the path names for streams is a pain because, since
'$' is a special character in the shell, it makes it impossible to
directly copy and paste the paths. If we make this change, some builds
will be left with vestigial directories with $global and $build in them
until they run clean. It also would break any scripts that manually
delete these paths. That doesn't seem like a common use case, but it's
worth mentioning.
The use of the persistent file stamp cache between watch runs didn't
seem to cause any issues, but there was some chance for inconsistency
between the file stamp cache and the file system so it makes sense to
put it behind the turbo flag.
After changing the default, the watch/on-change scripted test started
failing. It turns out that the reason is that the file stamp cache
managed by the watch process was not pre-filled by task evaluation. For
this reason, the first time a source file was modified, it was treated
as a creation regardless of whether or not it actually was.
To fix this, I add logic to pre-fill the watch file stamp cache if we
are _not_ persisting the file stamps between runs.
I ran a before and after with the scala build performance benchmark tool
and setting the watchPersistFileStamps key to true reduced the median
run time by about 200ms in the non-turbo case.
There was a bug where sometimes a source file change would not trigger a
new build if the change occurred during a build. Based on the logs, it
seemed to be because a number of redundant events were generated for the
same path and they triggered the anti-entropy constraint of the file
event monitor.
To fix this, I consolidated a number of observers of the global file
tree repository into a single observer. This way, I am able to ensure
that only one event is generated per file event.
I also reworked the onEvent callback to only stamp the file once. It was
previously stamping the modified source file for all of the aggregated
tasks. In the sbt project running `~compile` meant that we were stamping
a source file 22 times whenever the file changed.
This actually uncovered a zinc issue though as well. Zinc computes and
writes the hash of the source file to the analysis file after
compilation has completed. If a source file is modified during
compilation, then the new hash is written to the analysis file even when
the compilation may have been made against the previous version of the
file. Zinc will then refuse to re-compile that file until another change
is made.
I manually verified that in the sbt project if I ran `~compile` before
this change and modified a file during compilation, then no event was
triggered (there was a log message about the event being dropped due to
the anti-entropy constraint though). After this change, if I changed a
file during compilation, it seemed to always trigger, but due to the
zinc bug, it didn't always re-compile.
This was causing an abstract type pattern T is unchecked since it is
eliminated by erasure. It was unneeded because store.get[T] return
Option[(T, Long)]. I'm surprised that the compiler complained about
this.
@olegych reported that sbt would silently swallow the 'compile' command
in the multi-command, 'run;compile;reload'. I tracked this down to the
build source check. When the build has
Global / onChangedBuildSource := ReloadOnSourceChanges, the check build
sources command return a new state with "reload" prefixed. To actually
perform the reload, I returned this modified state with the prefixed
reload command.
There were two problems with this:
1) In the auto-reload case, the current command was not run after the
reload
2) If the multi-command contained reload, the auto-reload check would
have a false positive which triggered the bug in (1)
To fix this, I clear out the remaining commands before I run the check
command. That way, we know that if the remaining commands has a reload,
then it is an auto-reload. We then prefix the state with both the reload
and the current command.
I updated the scripted test for auto-reload to handle multi commands
containing reload.
In this commit, I both restore some sbt 1.2.8 behavior and enhance the
api for setting keyboard shortcuts in watch. I change the default start
message to just show the watch count, the tasks that are being monitored
and, on a new line, the instructions to terminate the watch or show more
options.
Here's what it looks like:
[info] 1. Monitoring source files for spark/compile...
[info] Press <enter> to interrupt or '?' for more options.
?
[info] Options:
[info] <enter> : interrupt (exits sbt in batch mode)
[info] <ctrl-d> : interrupt (exits sbt in batch mode)
[info] 'r' : re-run the command
[info] 's' : return to shell
[info] 'q' : quit sbt
[info] '?' : print options
I also made it so that the new options can be added (and old options
removed) with the watchInputOptions key. For example, to add an option
to reload the build with the key 'l', you could add
ThisBuild / watchInputOptions += Watch.InputOption('l', "reload", Watch.Reload)
to your global build.sbt.
After adding that to my global ~/sbt/1.0/global.sbt file, the output of
'?' became:
[info] Options:
[info] <ctrl-d> : interrupt (exits sbt in batch mode)
[info] <enter> : interrupt (exits sbt in batch mode)
[info] '?' : print options
[info] 'l' : reload
[info] 'q' : quit sbt
[info] 'r' : re-run the command
[info] 's' : return to shell
At the suggestion of @eed3si9n, instead of specifying the file cache
size in bytes, we now specify it in a formatted string. For example,
instead of specifying 128 megabytes in bytes (134217728), we can specify
it with the string "128M".
It can be quite slow to read and parse a large json file. Often, we are
reading and writing the same file over and over even though it isn't
actually changing. This is particularly noticeable with the
UpdateReport*. To speed this up, I introduce a global cache that can be
used to read values from a CacheStore. When using the cache, I've seen
the time for the update task drop from about 200ms to about 1ms. This
ends up being a 400ms time savings for test because update is called for
both Compile / compile and Test / compile.
The way that this works is that I add a new abstraction
CacheStoreFactoryFactory, which is the most enterprise java thing I've
ever written. We store a CacheStoreFactoryFactory in the sbt State.
When we make Streams for the task, we make the Stream's
cacheStoreFactory field using the CacheStoreFactoryFactory. The
generated CacheStoreFactory may or may not refer to a global cache.
The CacheStoreFactoryFactory may produce CacheStoreFactory instances
that delegate to a Caffeine cache with a max size parameter that is
specified in bytes by the fileCacheSize setting (which can also be set
with -Dsbt.file.cache.size). The size of the cache entry is estimated by
the size of the contents on disk. Since we are generally storing things
in the cache that are serialized as json, I figure that this should be a
reasonable estimate. I set the default max cache size to 128MB, which is
plenty of space for the previous cache entries for most projects. If the
size is set to 0, the CacheStoreFactoryFactory generates a regular
DirectoryStoreFactory.
To ensure that the cache accurately reflects the disk state of the
previous cache (or other cache's using a CacheStore), the Caffeine cache
stores the last modified time of the file whose contents it should
represent. If there is a discrepancy in the last modified times (which
would happen if, say, clean has been run), then the value is read from
disk even if the value hasn't changed.
* With the following build.sbt file, it takes roughly 200ms to read and
parse the update report on my compute:
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.3"
libraryDependencies += "org.scalatest" %% "scalatest" % "3.0.1"
This is because spark-sql has an enormous number of dependencies and the
update report ends up being 3MB.
Fixes#4802
For Ivy integration sbt uses credential task in a peculiar way. 9fa25de84c/main/src/main/scala/sbt/Defaults.scala (L2271-L2275)
This lets the build user put `credential` task in various places, like metabuild or root project, but they would all act as if they were scoped globally. This PR adds `allCredentials` task to emulate the behavior to pass credentials into lm-coursier.
This Resolver had the same name as the typesafe ivy resolver specified
in the launcher boot.properties. It was creating a number of verbose
warnings about having multiple resolvers with the same name. I noticed
that the ivy pattern is slightly different for the boot resolver with
this name. It didn't seem to be causing any problems to have both
resolvers.
Fixes#4839
I noticed that for a simple spark project that evaluating the test task
was faster than running run when both tasks evaluated the same code
block. I tracked this down to the BackgroundJobService.copyClasspath
method. This method was hashing the jar contents of all of the files in
the build. On my computer, this took 600ms (for context, the total run
time of the `run` task was around 1.2 seconds, which included about
150ms of scala compiling and 350ms of time in the main method). If
instead we use the last modified time it drops down to 5-10ms. As
predicted, the total runtime of `run` in this project dropped down to
600ms which was on par with `test`.
I am not sure why a hash was used rather than last modified in the first place,
so I reworked things in such a way that, by default, sbt will use a hash
but if turbo mode is on, it will use the last modified instead. We can
revisit the default later.
The multi parser had very poor performance if there were many commands.
Evaluating the expansion of something like "compile;" * 30 could cause
sbt to hang indefinitely. I believe this was due to excessive
backtracking due to the optional `(parser <~ semi.?).?` part of the
parser in the non-leading semicolon case.
I also reworked the implementation so that the multi command now has a
name. This allows us to partition the commands into multi and non-multi
commands more easily in State while still having multi in the command
list. With this change, builds and plugins can exclude the multi parser
if they wish.
Using the partitioned parsers, I removed the high/priority low priority
distinction. Instead, I made it so that the multi command will actually
check if the first command is a named command, like '~'. If it is, it
will pass the raw command argument with the named command stripped out
into the parser for the named command. If that is parseable, then we
directly apply the effect. Otherwise we prefix each multi command to the
state.
The `session save` command has the side effect of modifying a "*.sbt"
file so we don't want to warn about changes or automatically reload when
we return to the shell. Setting the hasCheckedMetaBuild attribute key to
false is sufficient to prevent this.
Ref: https://github.com/sbt/sbt/issues/4813
It didn't really make sense for Continuous to use the other command
parser and then reparse the results. I was able to slightly simplify
things by using the multi parser directly.
It was reported in https://github.com/sbt/sbt/issues/4808 that compared
to 1.2.8, sbt 1.3.0-RC2 will truncate the command args of an input task
that contains semicolons. This is actually intentional, but not
completely robust. For sbt >= 1.3.0, we are making ';' syntactically
meaningful. This means that it always represents a command separator
_unless_ it is inside of a quoted string. To enforce this, the multi parser
will effectively split the input on ';', it will then validate that each
command that it extracted is valid. If not, it throws an exception. If
the input is not a multi command, then parsing fails with a normal
failure.
I removed the multi command from the state's defined commands and reworked
State.combinedParser to explicitly first try multi parsing and fall back
to the regular combined parser if it is a regular command. If the multi
parser throws an uncaught exception, parsing fails even if the regular
parser could have successfully parsed the command. The reason is so that
we do not ever allow the user to evaluate, say 'run a;b'. Otherwise the
behavior would be inconsitent when the user runs 'compile; run a;b'
It's very expensive to compute the hash code of a deeply nested
Incomplete. To prevent a loop, we only want to check for object equality
which we can do with IdentityHashMap
It didn't make sense to aggregate the watch start command if it was
defined in multiple sources so we previously just fell back to the
default message if multiple commands were being run. This, however,
meant that if you ran, say, ~compile in an aggregate project, it wasn't
possible to customize the start message. There was a message in the
sbt gitter channel where someone found the new message too verbose and
wanted to print something shorter and I realized that this was an
unfortunate restriction. Instead of giving up, we can just use the
project's watchStartMessage as a default. If the watchStartMessage
setting is unset for some reason, we can fall back to the default.
I validated this change manually in the swoval project, which has an
aggregate root project, by running
set ThisBuild / watchStartMessage := { (_, _, _) => None }
and indeed nothing was printed after each task evaluation in '~compile'.
We discovered that turbo mode did not work in the sbt settings project.
I tracked this down to the dependency classloader bundle not being
thread safe.
I realized that some builds may crash if we automatically close the
classloaders. While I do think that is a good thing in general that we
are closing the loaders by default, we shuold have an option for
supressing this behavior.
I made all of the custom classloaders that we define for test and run
check this property before calling the super.close method.
We want to notify users about the new features available in sbt 1.3.0 to
increase visibility. Turbo mode especially can benefit many builds, but
we have opted to leave it off by default for now.
The banner will be displayed the first time sbt enters the shell command
on each sbt run. The banner can be disabled globally with the sbt.banner
system property. It can be displayed on a per sbt version by running the
skipWelcomeBanner command. That command touches a file in the ~/.sbt/1.0
directory to make it persistent across all projects.
I decided creations/deletions/updates were a bit too technical rather
than descriptive. It also wasn't really correct to say Meta build
sources because the meta build is the build for the build. Instead, I
dropped Meta from the sentence. I also made the instructions when
changed sources are detected more active. I left them capitalized since
they are instructions rather than warnings.
Apply these changes by running `reload`.
Automatically reload the build when source changes are detected by setting `Global / onChangedBuildSource := ReloadOnSourceChanges`.
Disable this warning by setting `Global / onChangedBuildSource := IgnoreSourceChanges`.
Also indentation was wrong for the printed files when multiple files had
changed because the mkString middle argument was " \n" rather than "\n ".
This creates a performance mode that enables experimental or advanced features that might require some debugging by the build user when it doesn't work.
Initially we are putting the layered ClassLoader (`ClassLoaderLayeringStrategy.AllLibraryJars`) behind this flag.
My first attempt, cc8c66c66d, at making
java reflection work with a layered classloader with a dependency jar
layer was a failure. It would generally work ok the first time the user
ran test, but was likely to fail on a second run.
There were a number of problems with the strategy:
1) It relied on the thread's context class loader to determine where to
attempt the reverse lookup.
2) It is not possible to ever reload classes loaded by a classloader.
Consider the classloading hierarchy a <- b, where
the arrow implies that a is the parent of b. I incorrectly thought
that a's loadClass method would be called every time a class loaded
by a made a call to Class.forName(name: String). This turns out to
not be the case. As a result, the second time the dependency layer
was used, where now the hierarchy is a <- c, that same Class.forName
call could return a Class from b which causes a nasty crash.
It isn't possible to work around the limitation in 2 so the only option
that allows both caching and java reflection across layers to work is to
cache the dependency layer, but invalidate when cross layer reflection
occurs. This turns out to be straightforward to implement. The
performance looks very similar to the ScalaLibrary strategy when java
reflection is used, which makes sense because the scala library and
scala reflect layers are still reused when the dependency layer is
invalidated.
I also stopped passing around the resource map to all of the layers.
Resource loading is hierarchical and the resource layer comes above the
dependency layer in the stack so there was no need for the bottom layers
to also be RawResource loaders.
While testing some classloader changes, I realized that we didn't always
close the test classloaders because they didn't necessarily extend
URLClassLoader, but instead implemented AutoCloseable.
Bonus: don't set the context classloader. It turns out that the test
framework does that anyway inside of trl.run so it was pointless to do
in Defaults.scala.
The Reload exception that I added in the sbt package really wasn't
intended to be public. It's only meant to be used by
checkMetaBuildSources, which the users shouldn't override. I put it in
the top package though because I wanted it to be next to FullReload. I
also am not sure why the Reload object in Watch was private[sbt], but
while writing documentation, I realized that users couldn't access it.
While writing documentation for the watch subsystem, I realized that
it's awkward to configure watch to clear the screen before task
evaluation. To make this easier, I added a setting watchBeforeCommand
which is an arbitrary function that will run before the watch process
evaluates the command(s).
I also added helper functions for adding clear screen functionality.
I also realized that we weren't using the watchOnEnter or
watchOnExit callbacks anywhere. I had added these to support setting up
some state before watch starts and cleaning it up before it exits for
plugin authors. It makes sense to remove that functionality for 1.3.0
and only if a need presents itself re-add it in a later version of sbt.
I also made a few apis private[sbt] that I'm not sure about. Writing
documentation made me realize that some of these are redundant and/or
not ready for general consumption.
I had previously set some of the watch settings at the global level and
some at the project level. While writing documentation for the new watch
subsystem, I realized that the defaults should be set globally so that
they can be overridden at the ThisBuild level.
I also moved watchTriggers to sbt.nio.Keys. It was an oversight that it
wasn't moved there in a5cefd45be.
The classpath filter for test and run was added in #661 to ensure that
the classloaders were correctly isolated. That is no longer necessary
with the new layering strategies that are more precise about which jars
are exposed at each level. Using the filtered classloader was causing
the reflection used by spark to fail when using java 11.
We were incorrectly building the dependency layer in the run task using
the raw jars from dependencyClasspath rather than the actual classpath
jars (which may be different if bgCopyClasspath is true -- which it is
by default). This was preventing spark from working with AllLibraryJars
because it would load its classes and resources from the coursier cache
but the classpath filter would reject the resources because they came
from the coursier cache instead of the classpath.
The docs for ClassLoader,
https://docs.oracle.com/javase/8/docs/api/java/lang/ClassLoader.html
say that all non-hierarchical custom classloaders should be registered
as parallel capable. The docs also suggest that custom classloaders
should try to only override findClass so I reworked LayerdClassLoader to
only override findClass. I also added locking to the class loading to
make it safe for concurrent loading.
All of the custom classloaders besides LayeredClassLoader either
subclass URLClassLoader or LayeredClassLoader but don't override
loadClass. Because those two classloaders are parallel capable, the
subclasses should be as well. It isn't possible to make classloaders
that are implemented in scala parallel capable because scala 2 doesn't
support jvm static blocks (dotty does support this with an annotation).
To work around this, I re-worked some of the classloaders so that they
are either directly implemented in java or I subclassed a scala
implementation class in java.
Jave reflection did not work with layered classloaders if a dependency
attempted to load a class that was below the dependency layer in the
layered classloader hierarchy. The underlying problem was (in general) a
call to Class.forName somewhere. If the classloader parameter is not
specified, then Class.forName locates the ClassLoader for the caller
using reflection. It ultimately delegates to that ClassLoader's
loadClass method. With the previous LayeredClassLoader class, there was
no way for that classloader to access a URL that was below it in the
class loading hierarchy. I reworked LayeredClassLoader so that if it
fails to load the class, it will check the Thread's context classloader
and see if there are other LayeredClassLoader instances below it. If so,
it will then check if any of those classloaders would be able to load
the class by using findResource. If the descendant loader can load the
class, then we manually load it with findClass.
For best caching performance, we want to use the scala-reflect.jar that
is found in the scala instance. Also, in the runtime configuration,
caching didn't work correctly because we filtered the scala reflect
library from the dependency jars. We really only wanted to filter out
the library jars.
It also was problematic to use a LayeredClassLoader for the scala
reflect layer because in a subsequent commit I add the capability for a
layered classloader to load classes from its descendant loaders. This
caused problems when the scala-reflect layer was a LayeredClassLoader.
Instead, I add the ScalaReflectClassLoader class for better reporting.
I had previously not though there was much reason to support commands in
continuous builds. This was primarily because there were a number of
questions regarding semantics. Commands cannot have fileInputs
specifically assigned to them because they don't have an associated
scope. They also can arbitrarily modify state so what is the expectation
when running ~foo where foo is a command that, for example, replaces
itself. I settled on the following semantics:
1) Commands run in a continuous build cannot modify the sbt execution
state which is to say that the state that is returned by continuous
is the same that was passed in (unless a reload occurred or we exited
the build with an exception)
2) Any global watchTriggers or fileInputs apply to a watched command.
They automatically inherit any fileInputs that are queried when
running tasks in a command. So, for example, ~+compile does what
you'd expect.
The implementation is fairly straightforward. If we can successfully
parse a command, but we cannot parse a scopedKey from it, we assign it a
private ScopedKey. When computing the watch settings for that key, we
will select the global settings through delegation. This is how it picks
up the global watchTriggers.
To run the command, I had to rework the task evaluation portion because
a command may return a state with additional commands to run. The cross
build command works this way. We recursively run all of the commands
starting with the original until we run out of commands to run. As part
of this work, I was able to remove the three argument version of
Command.processCommand that I'd previously added to support my old
approach to evaluating commands. This was a nice bonus.
I added scripted tests that check that global watchTriggers are picked
up and that commands that delegate to a command that uses fileInputs
automatically pick up those inputs during the watch. I also added a test
that approximates the ~+compile use case and ensures that the failure
semantics are what we expect and that the task runs for all defined
scala versions.
I noticed that sbt 1.3.0 was using more cpu when idling (either at the
shell or while waiting for file events) than 1.2.8. This was because I'd
reduced a number of timeouts to 2 milliseconds which was causing a
thread to keep waking up every 2 milliseconds to poll a queue. I thought
that this was cheaper than it actually is and drove the cpu utilization
to O(10%) of a cpu on my mac.
To address this, I consolidated a number of queues into a single queue
in CommandExchange and Continuous. In the CommandExchange case, I
reworked CommandChannel to have a register method that passes in a Queue
of CommandChannels. Whenever it appends an exec, it adds itself to the
queue. CommandExchange can then poll that queue directly and poll the
returned CommandChannel for the actual exec. Since the main thread is
blocking on this queue, it does not need to frequently wake up and can
just poll more or less indefinitely until a message is received. This
also reduces average latency compared to older versions of sbt since
messages will be processed almost as soon as they are received.
The continuous case is slightly more complicated because we are polling
from two sources, stdin and FileEventMonitor. In my ideal world, I'd
have a reactive api for both of those sources and they would just write
events to a shared queue that we could block on. That is nontrivial to
implement, so instead I consolidated the FileEventMonitor instances into
a single FileEventMonitor. Since there is now only one FileEventMonitor
queue, we can block on that queue for 30 milliseconds and the poll
stdin. This reduces cpu utilization to O(2%) on my machine while still
having reasonably low latency for key input events (the latency of file
events should be close to zero since we are usually polling the
FileEventMonitor queue when waiting).
I actually had a TODO about the FileEventMonitor change that this
resolves.
I noticed that ~run in the Play plugin relied on the presence of the
ContinuousEventMonitor key. Rather than completely break that feature, I
re-added the ContinuousEventMonitor attribute to the state in a
continuous build. That being said, the play team does need to update
their plugin because reading from the console no longer works in 1.3.0 so the
user has to Ctrl-C to exit the watch. I think the best way for them to
fix this is to override the '~' command in their plugin and if the input
is 'run', then they do their custom thing, otherwise they delegate to
the default '~' command.
Not caching scala reflect is extremely painful if the build uses
scalatest. It adds O(1second) to my watch performance benchmarks. It
actually made sbt 1.3.0 much slower than 0.13.17
I was benchmarking sbt with turbo mode on and found that tests weren't
running. This was because we were inadvertently excluding all of the
dependency jars from the dynamic classpath. I have no idea why the
scripted tests didn't catch this.
The scalatest scripted test didn't catch this because 'test' just
automaticaly succeeds if no test frameworks are found. To guard against
regression, I had to ensure that 'test' failed for every strategy if a
bad test file was present.
We want to check the build sources before any command runs, not just
tasks. To achieve this, I moved the logic for checking for build source
changes to CommandProcess.processCommand. Also, @smarter had noticed
that if a user modified a build file and then ran reload, a warning
would be displayed about changed build sources even though they had just
ran reload. This was because running reload didn't update the previous
cache for checkBuildSources / fileInputStamps. I fixed that bug by
running 'checkBuildSources / changedInputFiles' instead of
'checkBuildSources' when the user runs reload.
I verified that after this change:
- If I changed a build file and ran 'show version' a warning was printed
before it displayed the version. If I also set
global / onChangedBuildSource := ReloadOnSourceChanges, it
automatically reloaded before displaying the version.
- If I changed a build source and ran 'reload', followed by
'show version', no warnings were ever displayed.
As an implementation detail, I had to add the Aggregation.suppressShow
attribute key. We set this key to true before checking the build
sources. Without this, log.success is called whenever we check the build
sources which is both confusing and noisy.
Because we are sharing the scala library classloader with test and run,
it is possible that sbt will be competing with for resources with the
test and run tasks when trying to get threads from the global execution
context. Also, by using our own execution context, we can shut it down
when sbt exits.
The motivation for this change is that I was looking at the active jvm
threads of an idle sbt process and noticed a bunch of global execution
context threads.
@olegych reported in #4721 that play projects would get stuck in a
strange loop where modifying any source file would cause that source
file to always be recompiled every time a build was triggered regardless
of whether or not it was modified. This was because the play project
sets custom watchSources (using the legacy api) that overlap with the
fileInputs.
There were two parts to this fix:
1) When detecting an event, find if any of the dynamic inputs that cover
the glob use a hash. If so, these are file inputs so we want to
update the hash for the path, not the last modified time.
2) Only write hashes into the persistent file stamp cache. Computing the
last modified time is much cheaper than the hash so it makes sense to
avoid ever caching last modified times.
I wrote a scripted test that fails if Continuous writes a last modified
time into the file stamp cache instead of a hash. I also verified
manually that a sample play project no longer exhibits the weird
recompilation behavior.
Fixes#4721
This allows the user to do, for example,
watchTriggeredMessage := { (count, path, commands) =>
println(Watched.clearScreen)
watchTriggeredMessage.value(count, path, commands)
}
Also, there was a bug where I accidentally inadvertently used the
deprecated watch message setting where I meant to use the deprecated
trigger message setting.
Fixes#4696
@olegych reported in https://github.com/sbt/sbt/issues/4722 that
sometimes even when a build was triggered during watch that no
recompilation would occur. The cause of this was that we never
invalidated the file stamp cache for managed sources or output files.
The optimization of persisting the source file stamps between task
evaluations in a continuous build only really makes sense for unmanaged
sources. We make the implicit assumption that unmanaged sources are
infrequently updated and generally one at a time. That assumption does
not hold for unmanaged source or output files.
To fix this, I split the fileStampCache into two caches: one for
unmanaged sources and one for everything else. We only persist the
unmanagedFileStampCache during continuous builds. The
managedFileStampCache gets invalidated every time.
I added a scripted test that simulates changing a generated source file.
Prior to this change, the test would fail because the file stamp was not
invalidated for the new source file content.
Fixes#4722
@japgolly reported in #4695 that aliased commands don't work in watch
anymore. This was because we were extracting the task from the raw
command rather than the aliased command. Since the alias wasn't a valid
key, we weren't able to parse the scoped key. The fix is to find the
aliased value and try that if we fail to parse the original command.
Fixes#4695
In an interactive session, it's possible for task evaluation to trigger
an OOM: Metaspace but for sbt to continue working after that failure.
Moreover, the metaspace oom can be caused by using a dependency
classloader layer. If the user changes the layering strategy, they may
be able to re-run their command successfully.
Instead of caching based on the classpath of the resources, we should
instead cache based on the actual resource files. This commit achieves
that by adding the classpathFiles key which just transforms the
attributed classpath to a Seq[Path]. This implicitly generates the
outputFileStamps key for classpathFiles which we can use to read the
stamps (the file stamp entries for the classpath should get filled by
the compile task so this shouldn't actually cause any additional io).
The ShareRuntimeDependenciesLayerWithTestDependencies strategy doesn't
really work with resources, so it makes sense to get rid of it. Without
the share layer, there is no point in having separate
RuntimeDependencies and TestDependencies layers so I consolidated them
to Dependencies.
If we really care about binary/source compatibility for the 1.3.0-RCx
series, I can restore the traits and objects and set them private[sbt].
I think it was kind of a bug that they existed at all given the issue
with resources so it makes sense to just remove them.
This makes debugging a bit easier in the eclipse memory analyzer tool
since we get a more specific classloader type than URLClassLoader and by
giving the class a meaningful name, we can tell from where it
originated.
This commit removes the ClassLoaderCache that I'd added for the purpose
of caching layered classloaders. Instead, we will use the state's global
ClassLoaderCache. This is better both because it centralizes the
classloader caching and because the new ClassLoaderCache will evict
unused classloaders when the jvm is under memory pressure.
I also add a new layer for the resources that goes between the scala
library layer and the dependency layer. This should help in cases where
users depend on libraries that require access to resources, e.g.
logback.xml.
It was possible to make new classloaders for the scala library and other
jars with each new scala instance. To avoid this, I audited all of the
places within sbt where we make a ScalaInstance and ensure that we
instantiate them in such a way that the classloaders are retrieved
through the state's ClassLoaderCache.
After this change, I found from a heap dump that it was possible to run
test in a project that uses scala 2.12.8 and have only ONE classloader
for the scala library present in the heap dump. With older versions,
there were would be up to 3 or 4 in most heap dumps.
This commit adds a new ClassLoaderCache that builds on the
ClassLoaderCache that is present in zinc (and can be used to build an
instance of the zinc ClassLoaderCache to preserve compatibility). It
differs from the zinc classloader cache that it does not use direct
SoftReferences to classloaders. Instead, we create a wrapper loader
that can't load any classes and just delegates to its parent. This
allows us to add a thread that reaps the soft reference to the wrapper
loader. Crucially, we add a custom SoftReference class that has a strong
reference to the underlying classloader. This allows us to call close on
the strong reference.
The one issue with this approach is that we can't
rescue the jvm from crashing with an OOM: metaspace because the jvm
doesn't give us a chance to close and dereference the underlying
classloaders before it crashes. It WILL collect classloaders under
normal memory pressure, just not metaspace pressure. To fix this, I
check if the MaxMetaspaceSize is set via an MxBean and, if it is, we
fill the cache with regular soft references. We are going to change the
bash script to not set -XX:MaxMetaspaceSize by default so most builds
should probably end up correctly closing the classloaders after this
change. But we should break existing builds that set MaxMetaspaceSize
but don't crash.
As part of this commit, I audited all of the places where we were
instantiating ClassLoaderCache instances and instead pass in the
state's ClassLoaderCache instance. This reduces the total number of
classloaders created.
Using a lazy val causes the log manager to hold onto a reference to the
state. These would accumulate with each task evaluation. I found that
that in the beanpuree project, that if I ran compile 10 times in a row,
the heap usage was 40mb lower after this change.
This was problematic because it had no dependency on the compile task
which meant that any other task in the config would pick up those
fileOutputs which did not make sense. I noticed this because
(resources / outputFileStamps).value would include class files.
To minimize classloading and consistency between sbt instances launched
with the latest launcher compared to old launchers, I overhauled code
that replaces the app configuration and meta build classloader at
startup. The goals of this change for legacy launchers were:
1) Do not ever load the scala-library.jar from the app provider class loader.
2) Close the class loaders that are below the topLoader in the class
loading hierarcy
For the new launcher, we simply want to avoid modifying the loader at
all.
I added the SbtParserInit class so that it was more straightforward to
preload the global instance using reflection. We now use reflection to
instantiate an SbtParserInit instance for both the legacy and new
launcher cases to simplify the logic.
After this change, the legacy loader still uses somewhat more metaspace
than the new loader, but the difference seems to be O(10MB), which
should only impact projects that were close their MaxMetaspaceSize to
begin with.
I verified using javap that none of the code in this class uses the
scala standard library which should help metaspace since we don't load
much of the scala standard library until we enter xMainImpl.run.
I verified manually that ExternalHooks were still applied by default but
that I could set the incOptions in the Test and Compile configs so that
they weren't used.
Fixes#4624
Fixes#4712
This adds a specialized DependencyResolution instance called `scalaCompilerBridgeDependencyResolution` to download the compiler bridge. It has its own list of resolvers set by `scalaCompilerBridgeResolvers`. For backward compatibility, it will append `externalResolvers.value` as well.