In an interactive session, it's possible for task evaluation to trigger
an OOM: Metaspace but for sbt to continue working after that failure.
Moreover, the metaspace oom can be caused by using a dependency
classloader layer. If the user changes the layering strategy, they may
be able to re-run their command successfully.
Instead of caching based on the classpath of the resources, we should
instead cache based on the actual resource files. This commit achieves
that by adding the classpathFiles key which just transforms the
attributed classpath to a Seq[Path]. This implicitly generates the
outputFileStamps key for classpathFiles which we can use to read the
stamps (the file stamp entries for the classpath should get filled by
the compile task so this shouldn't actually cause any additional io).
The ShareRuntimeDependenciesLayerWithTestDependencies strategy doesn't
really work with resources, so it makes sense to get rid of it. Without
the share layer, there is no point in having separate
RuntimeDependencies and TestDependencies layers so I consolidated them
to Dependencies.
If we really care about binary/source compatibility for the 1.3.0-RCx
series, I can restore the traits and objects and set them private[sbt].
I think it was kind of a bug that they existed at all given the issue
with resources so it makes sense to just remove them.
This makes debugging a bit easier in the eclipse memory analyzer tool
since we get a more specific classloader type than URLClassLoader and by
giving the class a meaningful name, we can tell from where it
originated.
This commit removes the ClassLoaderCache that I'd added for the purpose
of caching layered classloaders. Instead, we will use the state's global
ClassLoaderCache. This is better both because it centralizes the
classloader caching and because the new ClassLoaderCache will evict
unused classloaders when the jvm is under memory pressure.
I also add a new layer for the resources that goes between the scala
library layer and the dependency layer. This should help in cases where
users depend on libraries that require access to resources, e.g.
logback.xml.
It was possible to make new classloaders for the scala library and other
jars with each new scala instance. To avoid this, I audited all of the
places within sbt where we make a ScalaInstance and ensure that we
instantiate them in such a way that the classloaders are retrieved
through the state's ClassLoaderCache.
After this change, I found from a heap dump that it was possible to run
test in a project that uses scala 2.12.8 and have only ONE classloader
for the scala library present in the heap dump. With older versions,
there were would be up to 3 or 4 in most heap dumps.
This commit adds a new ClassLoaderCache that builds on the
ClassLoaderCache that is present in zinc (and can be used to build an
instance of the zinc ClassLoaderCache to preserve compatibility). It
differs from the zinc classloader cache that it does not use direct
SoftReferences to classloaders. Instead, we create a wrapper loader
that can't load any classes and just delegates to its parent. This
allows us to add a thread that reaps the soft reference to the wrapper
loader. Crucially, we add a custom SoftReference class that has a strong
reference to the underlying classloader. This allows us to call close on
the strong reference.
The one issue with this approach is that we can't
rescue the jvm from crashing with an OOM: metaspace because the jvm
doesn't give us a chance to close and dereference the underlying
classloaders before it crashes. It WILL collect classloaders under
normal memory pressure, just not metaspace pressure. To fix this, I
check if the MaxMetaspaceSize is set via an MxBean and, if it is, we
fill the cache with regular soft references. We are going to change the
bash script to not set -XX:MaxMetaspaceSize by default so most builds
should probably end up correctly closing the classloaders after this
change. But we should break existing builds that set MaxMetaspaceSize
but don't crash.
As part of this commit, I audited all of the places where we were
instantiating ClassLoaderCache instances and instead pass in the
state's ClassLoaderCache instance. This reduces the total number of
classloaders created.
Using a lazy val causes the log manager to hold onto a reference to the
state. These would accumulate with each task evaluation. I found that
that in the beanpuree project, that if I ran compile 10 times in a row,
the heap usage was 40mb lower after this change.
This was problematic because it had no dependency on the compile task
which meant that any other task in the config would pick up those
fileOutputs which did not make sense. I noticed this because
(resources / outputFileStamps).value would include class files.
To minimize classloading and consistency between sbt instances launched
with the latest launcher compared to old launchers, I overhauled code
that replaces the app configuration and meta build classloader at
startup. The goals of this change for legacy launchers were:
1) Do not ever load the scala-library.jar from the app provider class loader.
2) Close the class loaders that are below the topLoader in the class
loading hierarcy
For the new launcher, we simply want to avoid modifying the loader at
all.
I added the SbtParserInit class so that it was more straightforward to
preload the global instance using reflection. We now use reflection to
instantiate an SbtParserInit instance for both the legacy and new
launcher cases to simplify the logic.
After this change, the legacy loader still uses somewhat more metaspace
than the new loader, but the difference seems to be O(10MB), which
should only impact projects that were close their MaxMetaspaceSize to
begin with.
I verified using javap that none of the code in this class uses the
scala standard library which should help metaspace since we don't load
much of the scala standard library until we enter xMainImpl.run.
I verified manually that ExternalHooks were still applied by default but
that I could set the incOptions in the Test and Compile configs so that
they weren't used.
Fixes#4624
Fixes#4712
This adds a specialized DependencyResolution instance called `scalaCompilerBridgeDependencyResolution` to download the compiler bridge. It has its own list of resolvers set by `scalaCompilerBridgeResolvers`. For backward compatibility, it will append `externalResolvers.value` as well.
Supershell was not reliably working and I tracked it down to
TaskProgress not actually publishing updates during task execution. This
seemed to happen because the background task was only run once when the
task started up. Once that task exited, no further task reports would be
published. The fix is to start a new thread every time we enter
EvaluateTask. I verified manually that it did not seem to leak threads
because EvaluateTask always calls shutdown, which calls
afterAllCompleted, which stops the progress thread.
I also decreased the default report period to 100ms. I can't imagine
that this will have a big effect on performance. It can be tuned with
the sbt.supershell.sleep parameter.
There are issues when using jdk > 8 where the rt.jar file can be
invalidated by ExternalHooks. This causes spurious rebuilds. I think
it's fair to assume that rt.jar never changes. If a dependency is named
rt.jar, then invalidation may not work correctly but I think that this
is the more important case to handle.
I verified that before this change, it was impossible to run
akka-actor/compile twice in a row using adopt jdk 11 and, after this
change, re-compilation worked as expected.
I discovered that some registered shutdown hooks would crash due to
67df72ab01 because they would try to load
classes from the closed classloader. To fix this, I add a internal
shutdown hooks mechanism that can be managed by sbt. Any unevaluated
shutdown hooks will be run when the sbt main method exits. This means
that they will be run when the user calls reboot. I think that is
reasonable.
I realized that all of the data structures that I needed to isolate the
classpath are contained in the AppProvider interface so there was no
need to use structural reflection on the top class loader.
It turns out that it can take roughly one second to instantiate a
scala.nsc.tools.Global instance for the first time. When sbt is starting
up, it also takes nearly 2 seconds to initialize logging. We can speed
up the boot time by doing these two things concurrently. On my machine,
I saw on average a 500ms decrease in startup time after this change.
I finally realized that the trick is that for non cygwin windows, the
available method on the jline wrapped input stream always returns zero.
Unlike on posix, however, the read method is interruptible which means
that we can just spin up a background thread that polls from the input
stream and writes it into a buffer.
I verified that it was no longer necessary to hit <enter> after 'r' to
rerun the continuous command on my windows vm after this change.
This commit finally fixes#241 by adding support for sbt to either
print a warning or automatically reload the project if the metabuild
sources have changed. To facilitate this, I introduce a new key,
metaBuildSourceOption which has three options:
1) IgnoreSourceChanges
2) WarnOnSourceChanges
3) ReloadOnSourceChanges
When the former is set, sbt will not check if the meta build sources
have changed. Otherwise, sbt will use the buildStructure / fileInputs to
get the ChangedFiles for the metabuild. If there are any changes, it
will either warn or reload the build depending on the value of
metaBuildSourceOption.
The mechanism for diffing the files is that I add a step to EvaluateTask
where, if the project has been loaded and
metaBuildSourceOption != IgnoreSourceChanges, we evaluate the needReload
task. If we need a reload, we return an error that indicates that a
Reload is necessary. When that error is detected, the MainLoop will
prepend "reload" to the pending commands for the state. Otherwise we
just print a warning and continue.
I benchmarked the overhead of this and it wasn't too bad. I generally
saw it taking 5-20ms to perform the check. Since this is only done once
per task evaluation run, I don't think it's a big deal. When
IgnoreSourceChanges is set, there is O(10us) overhead. If performance
does become a problem, we could add a global watch service and skip the
needReload evaluation if no files have been modified.
I removed the watchTrackMetaBuild key and made it so that the continuous
builds only track the meta build when
metaBuildSourceOption == ReloadOnSourceChanges
The persistentFileStampCache does seem to work pretty well but in case
users encounter issues, I add a boolean flag that allows the user to
turn this behavior off and always re-stamp every source file in every
task evaluation run.
Previously the persistent attribute map was only reset when the file
event monitor detected a change. This made it possible for the cache to
be inconsistent with the state of the file system*. To fix this, I add an
observer on the file tree repository used by the continuous build that
invalidates the cache entry for any path for which it detects a change.
Invalidating the cache does not stamp the file. That only happens either
when a task asks for the stamp for that file or when the file event
monitor reports an event and we must check if the file was updated or
not.
After this change, touching a source file will not trigger a build
unless the contents of the file actually changes.
I added a test that touches one source file in a project and updates the
content of the other. If the source file that is only ever touched ever
triggers a build, then the test fails.
* This could lead to under-compilation because ExternalHooks would not
detect that the file had been updated.
I had tried to be cute and only inject certain tasks if they're actually
used, but that made it so that dynamic tasks may not have be able to use
them.
The new io verion removes the PathFinder <-> Glob implicit translations.
It also has a number of small bug fixes related to directory listing via
FileTreeView.
The main project emits a number of deprecation warnings. I've isolated
the deprecation warnings related to Watch to the DeprecatedContinuous
file. I fixed the deprecation warnings where it was straightforward to
do so. After this change, there are three non-watch related changes
emitted:
1) Defaults.scala:3760 uses the deprecated InputTask.apply. This seems
fixable but I'm not in a hurry
2) oldLoadFailed and oldLastGrep are used by Main. I think this could
just be fixed by removing the deprecation warnings and setting them
private[sbt] since they will still be available in the shell.
I previously tried to fix https://github.com/sbt/sbt/issues/4608 in
fc715cab44 by finding the instance of
xsbt.boot.BootFilteredLoader in the classloader heirarchy. This was a
risky approach since it made a lot of assumptions about the classloaders
used to invoke xMain.run. Since the point is to filter out the scala
standard library jar, I reworked things to just find all the parents of
the scala provider loader and then walk the graph from the root
classloader until it finds the classloader that contains the scala
library. If no such classloader exists, it ends up returning the parent
of the scala provider library.
I also renamed the libraryLoader parameter to scalaProviderLoader since
that is what is actually passedin. It is actually the libraryLoader that
we want to exclude.
Ref https://github.com/sbt/sbt/issues/4661
local-preloaded-ivy contains dangling ivy.xml without JAR files.
We might include local-preloaded again once we have a preloaded in Maven layout.
I discovered there were a number of places where closing a ClassLoader
didn't work correctly because I was assuming it was a URLClassLoader
when it was actually a ClasspathFilter. I also incorrectly imported the
wrong kind of URLClassLoader in Run.scala. Finally, I close the
SbtMetaBuildClassLoader when xMain exits now.
I decided that there were too many settings related to the file
management that did similar things and had similar names but did
slightly different things. To improve this, I introduce the ChangedFiles
class to sbt.nio.file and switch to having just two task for file input
and output retrieval: all(Input|Output)Files and
changed(Input|Output)Files. If, for example, changedInputFiles returns
None that means that either the task has not yet been run or there were
no changes. If there have been any changes, then it will return
Some(changes) and the user can extract the relevant changes that they
are interested in.
The code may be slightly more verbose in a few places, but I think it's
worth it for the conceptual clarity.
This commit unifies my previous work for automatically watching the
input files for a task with support for automatically tracking and
cleaning up the output files of a task. The big idea is that users may
want to define tasks that depend on the file outputs of other tasks and
we may not want to run the dependent tasks if the output files of the
parent tasks are unmodified.
For example, suppose we wanted to make a plugin for managing typescript
files. There may be, say, two tasks with the following inputs and
outputs:
compileTypescript = taskKey[Unit]("shells out to compile typescript files")
fileInputs -- sourceDirectory / ** / "*.ts"
fileOutputs -- target / "generated-js" / ** / "*.js"
minifyGeneratedJS = taskKey[Path]("minifies the js files generated by compileTypescript to a single combined js file.")
dependsOn: compileTypeScript / fileOutputs
Given a clean build, the following should happen
> minifyGeneratedJS
// compileTypescript is run
// minifyGeneratedJS is run
> minifyGeneratedJS
// no op because nothing changed
> minifyGeneratedJS / clean
// removes the file returned by minifyGeneratedJS.previous
> minifyGeneratedJS
// re-runs minifyGeneratedJS with the previously compiled js artifacts
> compileTypescript / clean
// removes the generated js files
> minifyGeneratedJS
// compileTypescript is run because the previous clean removed the generated js files
// minifyGeneratedJS runs because the artifacts have changed
> clean
// removes the generated js files and the minified js file
> minifyGeneratedJS
// compileTypescript is run because the generated js files were
// minifyGeneratedJS is run both because it was removed and
Moreover, if compileTypescript fails, we want minifyGeneratedJS to fail
as well.
This commit makes this all possible. It adds a number of tasks to
sbt.nio.Keys that deal with the output files. When injecting settings, I
now identify all tasks that return Seq[File], File, Seq[Path] and Path
and create a hidden special task: dynamicFileOutputs: TaskKey[Seq[Path]]
This special task runs the underlying task and converts the result to
Seq[Path]. From there, we can have the tasks like changedOutputPaths
delegate to dynamicFileOutputs which, by proxy, runs the underlying
task. If any task in the input / output chain fails, the entire sequence
fails.
Unlike the fileInputs, we do not register the dynamicFileOutputs or
fileOutputs with continuous watch service so these paths will not
trigger a continuous build if they are modified. Only explicit unmanaged
input sources should should do that.
As part of this, I also added automatic generation of a custom clean task for
any task that returns Seq[File], File, Seq[Path] or Path. I also added
aggregation so that clean can be defined in a configuration or project
and it will automatically run clean for all of the tasks that have a
custom clean implementation in that task or project. The automatic clean
task will only delete files that are in the task target directory to
avoid accidentally deleting unmanaged files.
This commit refactors things so that the nio apis are located primarily
in the nio package. Because the nio keys are a first class sbt feature,
I had to add import sbt.nio._ and sbt.nio.Keys._ to the autoimports in
BuildUtil.scala
I had some ideas for allowing the user to get a copy of system in during
a continuous build but I can't really see a good use case now so I'm
going to remove it before 1.3.0.
In my recent changes to watch, I have been moving towards a world in
which sbt manages the file inputs and outputs at the task level. The
main idea is that we want to enable a user to specify the inputs and
outputs of a task and have sbt able to track those inputs across
multiple task evaluations. Sbt should be able to automatically trigger a
build when the inputs change and it also should be able to avoid task
evaluation if non of the inputs have changed.
The former case of having sbt automatically watch the file inputs of a
task has been present since watch was refactored. In this commit, I
make it possible for the user to retrieve the lists of new, modified and
deleted files. The user can then avoid task evaluation if none of the
inputs have changed.
To implement this, I inject a number of new settings during project
load if the fileInputs setting is defined for a task. The injected
settings are:
allPathsAndAttributes -- this retrieves all of the paths described by
the fileInputs for the task along with their attributes
fileStamps -- this retrieves all of the file stamps for the files
returned by allPathsAndAttributes
Using these two injected tasks, I also inject a number of derived tasks,
such as allFiles, which returns all of the regular files returned by
allPathsAndAttributes and changedFiles, which returns all of the regular
files that have been modified since the last run.
Using these injected settings, the user is able to write tasks that
avoid evaluation if the inputs haven't changed.
foo / fileInputs += baseDirectory.value.toGlob / ** / "*.scala"
foo := {
foo.previous match {
case Some(p) if (foo / changedFiles).value.isEmpty => p
case _ => fooImpl((foo / allFiles).value
}
}
To make this whole mechanism work, I add a private task key:
val fileAttributeMap = taskKey[java.util.HashMap[Path, Stamp]]("...")
This keeps track of the stamps for all of the files that are managed by
sbt. The fileStamps task will first look for the stamp in the attribute
map and, only if it is not present, it will update the cache. This
allows us to ensure that a given file will only be stamped once per task
evaluation run no matter how the file inputs are specified. Moreover, in
a continuous build, I'm able to reuse the attribute map which can
significantly reduce latency because the default file stamping
implementation used by zinc is fairly expensive (it can take anywhere
between 300-1500ms to stamp 5000 8kb source files on my mac).
I also renamed some of the watch related keys to be a bit more clear.
This trait may not even survive until 1.4.0. It should definitely not be
public. I got a little overexcited about programming with higher kinded
types when I added it.
The newest version of io repackages a number of classes into the
sbt.nio.* packages. It also changes some of the semantics of glob
related apis. This commit updates all of the usages of the updated apis
within sbt but should have no functional difference.
Since the new watch implementation has yet to be widely deployed, we
should hold off on deprecating the old keys. They could still be
deprecated in a patch release or in 1.4.0.
I noticed in a heap dump of sbt that there were many classloaders for
the scala instance. I then realized that we were making a new
classloader for the scala library on every test run. Even worse, the
ScalaInstanceLoader instance was never closed which lead to a metaspace
leak. I moved the scala instance classloader to the global classloader
cache. Not only will these be correctly cached, they will be closed if
evicted from the cache.
This adds dependency to LM implemented using Coursier.
I had to copy paste a bunch of code from sbt-coursier-shared to break the dependency to sbt.
`Global / useCoursier := false` or `-Dsbt.coursier=false` be used to opt-out of using Coursier for the dependency resolution.
Fixes#4438
This slims down update's UpdateReport by removing evicted modules
caller information. The larger the graph, the effect would be more
pronounced. For example, I saw a graph reduce from 5.9MB to 1.1MB in JSON file.
It was reported in https://github.com/sbt/sbt/issues/4608 that there was
a regression that tests run against scala 2.11 would fail. This was
because the interface loader incorrectly contained the scala library. To
fix this, I needed to find the xsbt.boot.BootFilteredLoader in the
classloading hierarchy and put the sbt testing interface library in
between that loader and the scala library loader.
We noticed that the community build was failing for some projects due to
some class loading issues. My initial approach for detecting the errors
didn't always work because the test framework might wrap the underlying
exception. To fix that, I add the causes to the list of throwables to
scan for class loading related exceptions. I also added
ClassNotFoundException to the list of types to check for. I additionally
added more context to the error message so that it is more clear to the
user what specifically went wrong. The error message is intended to
provide examples that the user can actually paste into the console.
There is also a lot of manual line wrapping that could be improved by
defining paragraphs and then splitting on the jline terminal width. That
could be a useful internal helper function to improve our log messages
in general.
The underlying issue could be addressed by allowing the user to specify
libraries that get excluded from the dependency classpath for layering
purposes. I'm not sure the best way to do that yet and adding that
feature wouldn't fix any existing builds so I think that would be better
handled in 1.4.0.
Prior to this commit, it was difficult to prevent the sbt metabuild
classpath from leaking into the runtime and test classpaths. The biggest
issue is that the test-inferface jar was located in the metabuild
classpath. We tried to prevent leakage using the DualClassLoader, but
this was an ugly solution that did not seem to work reliably. The fix is
to modify the actual sbt metabuild classloader provided by the sbt
launcher.
To do this, I add a new classloader SbtMetaClassLoader that isolates the
test-interface jar from the rest of the classpath. I modify xMain to
create a new AppConfiguration that uses this new classloader and
use reflection to invoke the sbt main method using the new classloader.
Not only do I think that this is a much saner solution than DualLoaders,
I accidentally fixed#4575 with this change.
It isn't possible to share the runtime and test layers correctly with
bgCopyClasspath is used because the runtime classpath uses the
dependencies copied to the boot directory while the test classpath uses
the classes in target and .ivy2. Since this is not the default and users
have to opt in to
ClassLoaderLayeringStrategy.ShareRuntimeDependenciesLayerWithTestDependencies,
I think this is fine.