At the suggestion of @eed3si9n, instead of specifying the file cache
size in bytes, we now specify it in a formatted string. For example,
instead of specifying 128 megabytes in bytes (134217728), we can specify
it with the string "128M".
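For illustration, this looks something like the following in a build.sbt (the Global scope here is an assumption):
    Global / fileCacheSize := "128M" // equivalent to the old 134217728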
It can be quite slow to read and parse a large json file. Often, we are
reading and writing the same file over and over even though it isn't
actually changing. This is particularly noticeable with the
UpdateReport*. To speed this up, I introduce a global cache that can be
used to read values from a CacheStore. When using the cache, I've seen
the time for the update task drop from about 200ms to about 1ms. This
ends up being a 400ms time savings for test because update is called for
both Compile / compile and Test / compile.
The way that this works is that I add a new abstraction
CacheStoreFactoryFactory, which is the most enterprise java thing I've
ever written. We store a CacheStoreFactoryFactory in the sbt State.
When we make Streams for the task, we make the Stream's
cacheStoreFactory field using the CacheStoreFactoryFactory. The
generated CacheStoreFactory may or may not refer to a global cache.
The CacheStoreFactoryFactory may produce CacheStoreFactory instances
that delegate to a Caffeine cache with a max size parameter that is
specified in bytes by the fileCacheSize setting (which can also be set
with -Dsbt.file.cache.size). The size of the cache entry is estimated by
the size of the contents on disk. Since we are generally storing things
in the cache that are serialized as json, I figure that this should be a
reasonable estimate. I set the default max cache size to 128MB, which is
plenty of space for the previous cache entries for most projects. If the
size is set to 0, the CacheStoreFactoryFactory generates a regular
DirectoryStoreFactory.
To ensure that the cache accurately reflects the disk state of the
previous cache (or other caches using a CacheStore), the Caffeine cache
stores the last modified time of the file whose contents it should
represent. If there is a discrepancy in the last modified times (which
would happen if, say, clean has been run), then the value is read from
disk even if the value hasn't changed.
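A simplified sketch of that read path (a plain ConcurrentHashMap stands in for the Caffeine cache and the names are illustrative):
    import java.nio.file.{ Files, Path }
    import java.util.concurrent.ConcurrentHashMap

    object GlobalFileValueCache {
      private final case class Entry(lastModified: Long, value: AnyRef)
      private val cache = new ConcurrentHashMap[Path, Entry]

      def get[T <: AnyRef](file: Path)(readFromDisk: Path => T): T = {
        val lastModified = Files.getLastModifiedTime(file).toMillis
        cache.get(file) match {
          case Entry(`lastModified`, value) => value.asInstanceOf[T] // cache agrees with disk
          case _ =>
            // never cached or the last modified time changed (e.g. after clean): re-read
            val value = readFromDisk(file)
            cache.put(file, Entry(lastModified, value))
            value
        }
      }
    }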
* With the following build.sbt file, it takes roughly 200ms to read and
parse the update report on my computer:
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.3"
libraryDependencies += "org.scalatest" %% "scalatest" % "3.0.1"
This is because spark-sql has an enormous number of dependencies and the
update report ends up being 3MB.
Fixes #4802
For the Ivy integration, sbt uses the credentials task in a peculiar way. 9fa25de84c/main/src/main/scala/sbt/Defaults.scala (L2271-L2275)
This lets the build user put the `credentials` task in various places, like the metabuild or the root project, but they would all act as if they were scoped globally. This PR adds an `allCredentials` task to emulate that behavior and pass credentials into lm-coursier.
This Resolver had the same name as the typesafe ivy resolver specified
in the launcher boot.properties. It was creating a number of verbose
warnings about having multiple resolvers with the same name. I noticed
that the ivy pattern is slightly different for the boot resolver with
this name. It didn't seem to be causing any problems to have both
resolvers.
Fixes #4839
I noticed that for a simple spark project, evaluating the test task
was faster than running run when both tasks evaluated the same code
block. I tracked this down to the BackgroundJobService.copyClasspath
method. This method was hashing the jar contents of all of the files in
the build. On my computer, this took 600ms (for context, the total run
time of the `run` task was around 1.2 seconds, which included about
150ms of scala compiling and 350ms of time in the main method). If
instead we use the last modified time it drops down to 5-10ms. As
predicted, the total runtime of `run` in this project dropped down to
600ms which was on par with `test`.
I am not sure why a hash was used rather than the last modified time in the
first place, so I reworked things so that, by default, sbt uses a hash,
but if turbo mode is on, it uses the last modified time instead. We can
revisit the default later.
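The difference between the two stamping strategies is roughly the following (a sketch, not the actual BackgroundJobService code):
    import java.io.File
    import java.nio.file.Files
    import java.security.MessageDigest

    // Hashing reads and digests every jar on the classpath; lastModified is a
    // single filesystem metadata call per file.
    def classpathStamp(jar: File, useLastModified: Boolean): String =
      if (useLastModified) jar.lastModified.toString
      else {
        val sha1 = MessageDigest.getInstance("SHA-1")
        sha1.update(Files.readAllBytes(jar.toPath))
        sha1.digest.map(b => f"$b%02x").mkString
      }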
The multi parser had very poor performance if there were many commands.
Evaluating the expansion of something like "compile;" * 30 could cause
sbt to hang indefinitely. I believe this was due to excessive
backtracking due to the optional `(parser <~ semi.?).?` part of the
parser in the non-leading semicolon case.
I also reworked the implementation so that the multi command now has a
name. This allows us to partition the commands into multi and non-multi
commands more easily in State while still having multi in the command
list. With this change, builds and plugins can exclude the multi parser
if they wish.
Using the partitioned parsers, I removed the high priority/low priority
distinction. Instead, I made it so that the multi command will actually
check if the first command is a named command, like '~'. If it is, it
will pass the raw command argument with the named command stripped out
into the parser for the named command. If that is parseable, then we
directly apply the effect. Otherwise we prefix each multi command to the
state.
The `session save` command has the side effect of modifying a "*.sbt"
file so we don't want to warn about changes or automatically reload when
we return to the shell. Setting the hasCheckedMetaBuild attribute key to
false is sufficient to prevent this.
Ref: https://github.com/sbt/sbt/issues/4813
It didn't really make sense for Continuous to use the other command
parser and then reparse the results. I was able to slightly simplify
things by using the multi parser directly.
It was reported in https://github.com/sbt/sbt/issues/4808 that compared
to 1.2.8, sbt 1.3.0-RC2 will truncate the command args of an input task
that contains semicolons. This is actually intentional, but not
completely robust. For sbt >= 1.3.0, we are making ';' syntactically
meaningful. This means that it always represents a command separator
_unless_ it is inside of a quoted string. To enforce this, the multi parser
effectively splits the input on ';' and then validates each extracted
command, throwing an exception if any of them is invalid. If
the input is not a multi command, then parsing fails with a normal
failure.
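A sketch of the splitting rule (the real multi parser is built from sbt's parser combinators and also validates each extracted command; this only shows the quoted-string exception):
    def splitCommands(input: String): List[String] = {
      val commands = List.newBuilder[String]
      val current = new StringBuilder
      var inQuotes = false
      input.foreach {
        case '"'              => inQuotes = !inQuotes; current += '"'
        case ';' if !inQuotes => commands += current.result().trim; current.clear()
        case c                => current += c
      }
      commands += current.result().trim
      commands.result().filter(_.nonEmpty)
    }
    // splitCommands("set name := \"a;b\"; compile")
    // res: List("set name := \"a;b\"", "compile")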
I removed the multi command from the state's defined commands and reworked
State.combinedParser to explicitly first try multi parsing and fall back
to the regular combined parser if it is a regular command. If the multi
parser throws an uncaught exception, parsing fails even if the regular
parser could have successfully parsed the command. The reason is so that
we do not ever allow the user to evaluate, say, 'run a;b'. Otherwise the
behavior would be inconsistent when the user runs 'compile; run a;b'.
It's very expensive to compute the hash code of a deeply nested
Incomplete. To prevent a loop, we only want to check for object equality,
which we can do with an IdentityHashMap.
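The check looks something like this, where Node stands in for the nested Incomplete structure:
    import java.util.{ Collections, IdentityHashMap }

    final case class Node(label: String, children: Seq[Node])

    // The identity-based set never calls the expensive structural hashCode/equals.
    def visit(root: Node)(f: Node => Unit): Unit = {
      val seen = Collections.newSetFromMap(new IdentityHashMap[Node, java.lang.Boolean])
      def loop(node: Node): Unit =
        if (seen.add(node)) { // false if this exact instance was already visited
          f(node)
          node.children.foreach(loop)
        }
      loop(root)
    }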
It didn't make sense to aggregate the watch start command if it was
defined in multiple sources so we previously just fell back to the
default message if multiple commands were being run. This, however,
meant that if you ran, say, ~compile in an aggregate project, it wasn't
possible to customize the start message. There was a message in the
sbt gitter channel where someone found the new message too verbose and
wanted to print something shorter and I realized that this was an
unfortunate restriction. Instead of giving up, we can just use the
project's watchStartMessage as a default. If the watchStartMessage
setting is unset for some reason, we can fall back to the default.
I validated this change manually in the swoval project, which has an
aggregate root project, by running
set ThisBuild / watchStartMessage := { (_, _, _) => None }
and indeed nothing was printed after each task evaluation in '~compile'.
We discovered that turbo mode did not work in the sbt settings project.
I tracked this down to the dependency classloader bundle not being
thread safe.
I realized that some builds may crash if we automatically close the
classloaders. While I do think that is a good thing in general that we
are closing the loaders by default, we should have an option for
suppressing this behavior.
I made all of the custom classloaders that we define for test and run
check this property before calling the super.close method.
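The pattern is roughly the following (the property name is made up for illustration; the real flag may be spelled differently):
    import java.net.{ URL, URLClassLoader }

    class ManagedClassLoader(urls: Array[URL], parent: ClassLoader)
        extends URLClassLoader(urls, parent) {
      // close() becomes a no-op when the user opts out of classloader closing
      override def close(): Unit =
        if (!java.lang.Boolean.getBoolean("sbt.classloader.close.disabled")) super.close()
    }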
We want to notify users about the new features available in sbt 1.3.0 to
increase visibility. Turbo mode especially can benefit many builds, but
we have opted to leave it off by default for now.
The banner will be displayed the first time sbt enters the shell command
on each sbt run. The banner can be disabled globally with the sbt.banner
system property. It can be disabled on a per-sbt-version basis by running the
skipWelcomeBanner command. That command touches a file in the ~/.sbt/1.0
directory to make it persistent across all projects.
I decided creations/deletions/updates were a bit too technical rather
than descriptive. It also wasn't really correct to say Meta build
sources because the meta build is the build for the build. Instead, I
dropped Meta from the sentence. I also made the instructions when
changed sources are detected more active. I left them capitalized since
they are instructions rather than warnings.
Apply these changes by running `reload`.
Automatically reload the build when source changes are detected by setting `Global / onChangedBuildSource := ReloadOnSourceChanges`.
Disable this warning by setting `Global / onChangedBuildSource := IgnoreSourceChanges`.
Also, the indentation was wrong for the printed files when multiple files had
changed because the mkString middle argument was " \n" rather than "\n ".
This creates a performance mode that enables experimental or advanced features that might require some debugging by the build user when they don't work.
Initially we are putting the layered ClassLoader (`ClassLoaderLayeringStrategy.AllLibraryJars`) behind this flag.
My first attempt, cc8c66c66d, at making
java reflection work with a layered classloader with a dependency jar
layer was a failure. It would generally work ok the first time the user
ran test, but was likely to fail on a second run.
There were a number of problems with the strategy:
1) It relied on the thread's context class loader to determine where to
attempt the reverse lookup.
2) It is not possible to ever reload classes loaded by a classloader.
Consider the classloading hierarchy a <- b, where
the arrow implies that a is the parent of b. I incorrectly thought
that a's loadClass method would be called every time a class loaded
by a made a call to Class.forName(name: String). This turns out to
not be the case. As a result, the second time the dependency layer
was used, where now the hierarchy is a <- c, that same Class.forName
call could return a Class from b which causes a nasty crash.
It isn't possible to work around the limitation in 2 so the only option
that allows both caching and java reflection across layers to work is to
cache the dependency layer, but invalidate when cross layer reflection
occurs. This turns out to be straightforward to implement. The
performance looks very similar to the ScalaLibrary strategy when java
reflection is used, which makes sense because the scala library and
scala reflect layers are still reused when the dependency layer is
invalidated.
I also stopped passing around the resource map to all of the layers.
Resource loading is hierarchical and the resource layer comes above the
dependency layer in the stack so there was no need for the bottom layers
to also be RawResource loaders.
While testing some classloader changes, I realized that we didn't always
close the test classloaders because they didn't necessarily extend
URLClassLoader, but instead implemented AutoCloseable.
Bonus: don't set the context classloader. It turns out that the test
framework does that anyway inside of trl.run so it was pointless to do
in Defaults.scala.
The Reload exception that I added in the sbt package really wasn't
intended to be public. It's only meant to be used by
checkMetaBuildSources, which the users shouldn't override. I put it in
the top package though because I wanted it to be next to FullReload. I
also am not sure why the Reload object in Watch was private[sbt], but
while writing documentation, I realized that users couldn't access it.
While writing documentation for the watch subsystem, I realized that
it's awkward to configure watch to clear the screen before task
evaluation. To make this easier, I added a setting watchBeforeCommand
which is an arbitrary function that will run before the watch process
evaluates the command(s).
I also added helper functions for adding clear screen functionality.
I also realized that we weren't using the watchOnEnter or
watchOnExit callbacks anywhere. I had added these to support setting up
some state before watch starts and cleaning it up before it exits for
plugin authors. It makes sense to remove that functionality for 1.3.0
and re-add it in a later version of sbt only if a need presents itself.
I also made a few apis private[sbt] that I'm not sure about. Writing
documentation made me realize that some of these are redundant and/or
not ready for general consumption.
I had previously set some of the watch settings at the global level and
some at the project level. While writing documentation for the new watch
subsystem, I realized that the defaults should be set globally so that
they can be overridden at the ThisBuild level.
I also moved watchTriggers to sbt.nio.Keys. It was an oversight that it
wasn't moved there in a5cefd45be.
The classpath filter for test and run was added in #661 to ensure that
the classloaders were correctly isolated. That is no longer necessary
with the new layering strategies that are more precise about which jars
are exposed at each level. Using the filtered classloader was causing
the reflection used by spark to fail when using java 11.
We were incorrectly building the dependency layer in the run task using
the raw jars from dependencyClasspath rather than the actual classpath
jars (which may be different if bgCopyClasspath is true -- which it is
by default). This was preventing spark from working with AllLibraryJars
because it would load its classes and resources from the coursier cache
but the classpath filter would reject the resources because they came
from the coursier cache instead of the classpath.
The docs for ClassLoader,
https://docs.oracle.com/javase/8/docs/api/java/lang/ClassLoader.html
say that all non-hierarchical custom classloaders should be registered
as parallel capable. The docs also suggest that custom classloaders
should try to only override findClass so I reworked LayeredClassLoader to
only override findClass. I also added locking to the class loading to
make it safe for concurrent loading.
All of the custom classloaders besides LayeredClassLoader either
subclass URLClassLoader or LayeredClassLoader but don't override
loadClass. Because those two classloaders are parallel capable, the
subclasses should be as well. It isn't possible to make classloaders
that are implemented in scala parallel capable because scala 2 doesn't
support jvm static blocks (dotty does support this with an annotation).
To work around this, I re-worked some of the classloaders so that they
are either directly implemented in java or I subclassed a scala
implementation class in java.
Java reflection did not work with layered classloaders if a dependency
attempted to load a class that was below the dependency layer in the
layered classloader hierarchy. The underlying problem was (in general) a
call to Class.forName somewhere. If the classloader parameter is not
specified, then Class.forName locates the ClassLoader for the caller
using reflection. It ultimately delegates to that ClassLoader's
loadClass method. With the previous LayeredClassLoader class, there was
no way for that classloader to access a URL that was below it in the
class loading hierarchy. I reworked LayeredClassLoader so that if it
fails to load the class, it will check the Thread's context classloader
and see if there are other LayeredClassLoader instances below it. If so,
it will then check if any of those classloaders would be able to load
the class by using findResource. If the descendant loader can load the
class, then we manually load it with findClass.
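A sketch of that reverse lookup (the class and method shapes are illustrative, not the actual LayeredClassLoader code):
    import java.net.{ URL, URLClassLoader }

    class LayeredLoader(urls: Array[URL], parent: ClassLoader)
        extends URLClassLoader(urls, parent) {
      override protected def findClass(name: String): Class[_] =
        try super.findClass(name)
        catch {
          case e: ClassNotFoundException =>
            val resource = name.replace('.', '/') + ".class"
            // walk up from the thread's context classloader until we reach this
            // layer; everything in between is a descendant layer
            val descendants = Iterator
              .iterate(Thread.currentThread.getContextClassLoader)(_.getParent)
              .takeWhile(l => l != null && l != this)
              .collect { case l: LayeredLoader => l }
              .toList
            descendants.find(_.findResource(resource) != null) match {
              case Some(loader) => loader.loadClass(name) // delegate to the layer that can see it
              case None         => throw e
            }
        }
    }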
For best caching performance, we want to use the scala-reflect.jar that
is found in the scala instance. Also, in the runtime configuration,
caching didn't work correctly because we filtered the scala reflect
library from the dependency jars. We really only wanted to filter out
the library jars.
It also was problematic to use a LayeredClassLoader for the scala
reflect layer because in a subsequent commit I add the capability for a
layered classloader to load classes from its descendant loaders. This
caused problems when the scala-reflect layer was a LayeredClassLoader.
Instead, I add the ScalaReflectClassLoader class for better reporting.
I had previously not thought there was much reason to support commands in
continuous builds. This was primarily because there were a number of
questions regarding semantics. Commands cannot have fileInputs
specifically assigned to them because they don't have an associated
scope. They can also arbitrarily modify the state, so what is the expectation
when running ~foo where foo is a command that, for example, replaces
itself? I settled on the following semantics:
1) Commands run in a continuous build cannot modify the sbt execution
state which is to say that the state that is returned by continuous
is the same that was passed in (unless a reload occurred or we exited
the build with an exception)
2) Any global watchTriggers or fileInputs apply to a watched command.
They automatically inherit any fileInputs that are queried when
running tasks in a command. So, for example, ~+compile does what
you'd expect.
The implementation is fairly straightforward. If we can successfully
parse a command, but we cannot parse a scopedKey from it, we assign it a
private ScopedKey. When computing the watch settings for that key, we
will select the global settings through delegation. This is how it picks
up the global watchTriggers.
To run the command, I had to rework the task evaluation portion because
a command may return a state with additional commands to run. The cross
build command works this way. We recursively run all of the commands
starting with the original until we run out of commands to run. As part
of this work, I was able to remove the three argument version of
Command.processCommand that I'd previously added to support my old
approach to evaluating commands. This was a nice bonus.
I added scripted tests that check that global watchTriggers are picked
up and that commands that delegate to a command that uses fileInputs
automatically pick up those inputs during the watch. I also added a test
that approximates the ~+compile use case and ensures that the failure
semantics are what we expect and that the task runs for all defined
scala versions.
I noticed that sbt 1.3.0 was using more cpu when idling (either at the
shell or while waiting for file events) than 1.2.8. This was because I'd
reduced a number of timeouts to 2 milliseconds which was causing a
thread to keep waking up every 2 milliseconds to poll a queue. I thought
that this was cheaper than it actually is; it drove the cpu utilization
to O(10%) of a cpu on my mac.
To address this, I consolidated a number of queues into a single queue
in CommandExchange and Continuous. In the CommandExchange case, I
reworked CommandChannel to have a register method that passes in a Queue
of CommandChannels. Whenever it appends an exec, it adds itself to the
queue. CommandExchange can then poll that queue directly and poll the
returned CommandChannel for the actual exec. Since the main thread is
blocking on this queue, it does not need to frequently wake up and can
just poll more or less indefinitely until a message is received. This
also reduces average latency compared to older versions of sbt since
messages will be processed almost as soon as they are received.
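The register/poll shape is roughly the following (illustrative, not sbt's actual CommandChannel API):
    import java.util.concurrent.LinkedBlockingQueue

    abstract class Channel {
      private val execs = new LinkedBlockingQueue[String]
      @volatile private var ready: LinkedBlockingQueue[Channel] = _

      def register(queue: LinkedBlockingQueue[Channel]): Unit = ready = queue

      protected def append(exec: String): Unit = {
        execs.add(exec)
        val q = ready
        if (q != null) q.add(this) // wake up the thread blocked on the shared queue
      }

      def poll(): Option[String] = Option(execs.poll())
    }
    // Main loop sketch: channels.take() blocks with no busy polling, then the
    // returned channel is polled for the actual exec.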
The continuous case is slightly more complicated because we are polling
from two sources, stdin and FileEventMonitor. In my ideal world, I'd
have a reactive api for both of those sources and they would just write
events to a shared queue that we could block on. That is nontrivial to
implement, so instead I consolidated the FileEventMonitor instances into
a single FileEventMonitor. Since there is now only one FileEventMonitor
queue, we can block on that queue for 30 milliseconds and then poll
stdin. This reduces cpu utilization to O(2%) on my machine while still
having reasonably low latency for key input events (the latency of file
events should be close to zero since we are usually polling the
FileEventMonitor queue when waiting).
I actually had a TODO about the FileEventMonitor change that this
resolves.
I noticed that ~run in the Play plugin relied on the presence of the
ContinuousEventMonitor key. Rather than completely break that feature, I
re-added the ContinuousEventMonitor attribute to the state in a
continuous build. That being said, the play team does need to update
their plugin because reading from the console no longer works in 1.3.0 so the
user has to Ctrl-C to exit the watch. I think the best way for them to
fix this is to override the '~' command in their plugin and if the input
is 'run', then they do their custom thing, otherwise they delegate to
the default '~' command.
Not caching scala reflect is extremely painful if the build uses
scalatest. It adds O(1second) to my watch performance benchmarks. It
actually made sbt 1.3.0 much slower than 0.13.17.
I was benchmarking sbt with turbo mode on and found that tests weren't
running. This was because we were inadvertently excluding all of the
dependency jars from the dynamic classpath. I have no idea why the
scripted tests didn't catch this.
The scalatest scripted test didn't catch this because 'test' just
automatically succeeds if no test frameworks are found. To guard against
regression, I had to ensure that 'test' failed for every strategy if a
bad test file was present.
We want to check the build sources before any command runs, not just
tasks. To achieve this, I moved the logic for checking for build source
changes to CommandProcess.processCommand. Also, @smarter had noticed
that if a user modified a build file and then ran reload, a warning
would be displayed about changed build sources even though they had just
run reload. This was because running reload didn't update the previous
cache for checkBuildSources / fileInputStamps. I fixed that bug by
running 'checkBuildSources / changedInputFiles' instead of
'checkBuildSources' when the user runs reload.
I verified that after this change:
- If I changed a build file and ran 'show version' a warning was printed
before it displayed the version. If I also set
Global / onChangedBuildSource := ReloadOnSourceChanges, it
automatically reloaded before displaying the version.
- If I changed a build source and ran 'reload', followed by
'show version', no warnings were ever displayed.
As an implementation detail, I had to add the Aggregation.suppressShow
attribute key. We set this key to true before checking the build
sources. Without this, log.success is called whenever we check the build
sources which is both confusing and noisy.
Because we are sharing the scala library classloader with test and run,
it is possible that sbt will be competing for resources with the
test and run tasks when trying to get threads from the global execution
context. Also, by using our own execution context, we can shut it down
when sbt exits.
The motivation for this change is that I was looking at the active jvm
threads of an idle sbt process and noticed a bunch of global execution
context threads.
@olegych reported in #4721 that play projects would get stuck in a
strange loop where modifying any source file would cause that source
file to always be recompiled every time a build was triggered regardless
of whether or not it was modified. This was because the play project
sets custom watchSources (using the legacy api) that overlap with the
fileInputs.
There were two parts to this fix:
1) When detecting an event, find if any of the dynamic inputs that cover
the glob use a hash. If so, these are file inputs so we want to
update the hash for the path, not the last modified time.
2) Only write hashes into the persistent file stamp cache. Computing the
last modified time is much cheaper than the hash so it makes sense to
avoid ever caching last modified times.
I wrote a scripted test that fails if Continuous writes a last modified
time into the file stamp cache instead of a hash. I also verified
manually that a sample play project no longer exhibits the weird
recompilation behavior.
Fixes #4721
This allows the user to do, for example,
watchTriggeredMessage := { (count, path, commands) =>
  println(Watched.clearScreen)
  watchTriggeredMessage.value(count, path, commands)
}
Also, there was a bug where I inadvertently used the
deprecated watch message setting where I meant to use the deprecated
trigger message setting.
Fixes #4696
@olegych reported in https://github.com/sbt/sbt/issues/4722 that
sometimes even when a build was triggered during watch that no
recompilation would occur. The cause of this was that we never
invalidated the file stamp cache for managed sources or output files.
The optimization of persisting the source file stamps between task
evaluations in a continuous build only really makes sense for unmanaged
sources. We make the implicit assumption that unmanaged sources are
infrequently updated and generally one at a time. That assumption does
not hold for managed sources or output files.
To fix this, I split the fileStampCache into two caches: one for
unmanaged sources and one for everything else. We only persist the
unmanagedFileStampCache during continuous builds. The
managedFileStampCache gets invalidated every time.
I added a scripted test that simulates changing a generated source file.
Prior to this change, the test would fail because the file stamp was not
invalidated for the new source file content.
Fixes #4722
@japgolly reported in #4695 that aliased commands don't work in watch
anymore. This was because we were extracting the task from the raw
command rather than the aliased command. Since the alias wasn't a valid
key, we weren't able to parse the scoped key. The fix is to find the
aliased value and try that if we fail to parse the original command.
Fixes #4695
In an interactive session, it's possible for task evaluation to trigger
an OOM: Metaspace but for sbt to continue working after that failure.
Moreover, the metaspace oom can be caused by using a dependency
classloader layer. If the user changes the layering strategy, they may
be able to re-run their command successfully.
Instead of caching based on the classpath of the resources, we should
cache based on the actual resource files. This commit achieves
that by adding the classpathFiles key which just transforms the
attributed classpath to a Seq[Path]. This implicitly generates the
outputFileStamps key for classpathFiles which we can use to read the
stamps (the file stamp entries for the classpath should get filled by
the compile task so this shouldn't actually cause any additional io).
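Conceptually the derived key is just a projection of the classpath, something like (a sketch; the real definition is part of the injected settings):
    classpathFiles := fullClasspath.value.map(_.data.toPath)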
The ShareRuntimeDependenciesLayerWithTestDependencies strategy doesn't
really work with resources, so it makes sense to get rid of it. Without
the share layer, there is no point in having separate
RuntimeDependencies and TestDependencies layers so I consolidated them
to Dependencies.
If we really care about binary/source compatibility for the 1.3.0-RCx
series, I can restore the traits and objects and set them private[sbt].
I think it was kind of a bug that they existed at all given the issue
with resources so it makes sense to just remove them.
This makes debugging a bit easier in the eclipse memory analyzer tool
since we get a more specific classloader type than URLClassLoader and by
giving the class a meaningful name, we can tell from where it
originated.
This commit removes the ClassLoaderCache that I'd added for the purpose
of caching layered classloaders. Instead, we will use the state's global
ClassLoaderCache. This is better both because it centralizes the
classloader caching and because the new ClassLoaderCache will evict
unused classloaders when the jvm is under memory pressure.
I also add a new layer for the resources that goes between the scala
library layer and the dependency layer. This should help in cases where
users depend on libraries that require access to resources, e.g.
logback.xml.
It was possible to make new classloaders for the scala library and other
jars with each new scala instance. To avoid this, I audited all of the
places within sbt where we make a ScalaInstance and ensured that we
instantiate them in such a way that the classloaders are retrieved
through the state's ClassLoaderCache.
After this change, I found from a heap dump that it was possible to run
test in a project that uses scala 2.12.8 and have only ONE classloader
for the scala library present in the heap dump. With older versions,
there would be up to 3 or 4 in most heap dumps.
This commit adds a new ClassLoaderCache that builds on the
ClassLoaderCache that is present in zinc (and can be used to build an
instance of the zinc ClassLoaderCache to preserve compatibility). It
differs from the zinc classloader cache in that it does not use direct
SoftReferences to classloaders. Instead, we create a wrapper loader
that can't load any classes and just delegates to its parent. This
allows us to add a thread that reaps the soft reference to the wrapper
loader. Crucially, we add a custom SoftReference class that has a strong
reference to the underlying classloader. This allows us to call close on
the strong reference.
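A sketch of the reference scheme (names are illustrative):
    import java.lang.ref.{ ReferenceQueue, SoftReference }
    import java.net.URLClassLoader

    // The wrapper defines no classes of its own and just delegates to its parent.
    final class WrapperLoader(underlying: URLClassLoader) extends ClassLoader(underlying)

    final class CloseableSoftReference(
        wrapper: WrapperLoader,
        val underlying: URLClassLoader, // strong reference: survives collection of the wrapper
        queue: ReferenceQueue[ClassLoader]
    ) extends SoftReference[ClassLoader](wrapper, queue) {
      def close(): Unit = underlying.close()
    }
    // Reaper sketch: block on queue.remove() and call close() on each
    // CloseableSoftReference whose wrapper was collected.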
The one issue with this approach is that we can't
rescue the jvm from crashing with an OOM: metaspace because the jvm
doesn't give us a chance to close and dereference the underlying
classloaders before it crashes. It WILL collect classloaders under
normal memory pressure, just not metaspace pressure. To fix this, I
check if the MaxMetaspaceSize is set via an MXBean and, if it is, we
fill the cache with regular soft references. We are going to change the
bash script to not set -XX:MaxMetaspaceSize by default so most builds
should probably end up correctly closing the classloaders after this
change. But we should break existing builds that set MaxMetaspaceSize
but don't crash.
As part of this commit, I audited all of the places where we were
instantiating ClassLoaderCache instances and instead passed in the
state's ClassLoaderCache instance. This reduces the total number of
classloaders created.
Using a lazy val causes the log manager to hold onto a reference to the
state. These would accumulate with each task evaluation. I found that
in the beanpuree project, if I ran compile 10 times in a row,
the heap usage was 40MB lower after this change.
This was problematic because it had no dependency on the compile task
which meant that any other task in the config would pick up those
fileOutputs which did not make sense. I noticed this because
(resources / outputFileStamps).value would include class files.
To minimize classloading and improve consistency between sbt instances
launched with the latest launcher and those launched with old launchers, I
overhauled the code that replaces the app configuration and meta build classloader at
startup. The goals of this change for legacy launchers were:
1) Do not ever load the scala-library.jar from the app provider class loader.
2) Close the class loaders that are below the topLoader in the class
loading hierarchy.
For the new launcher, we simply want to avoid modifying the loader at
all.
I added the SbtParserInit class so that it was more straightforward to
preload the global instance using reflection. We now use reflection to
instantiate an SbtParserInit instance for both the legacy and new
launcher cases to simplify the logic.
After this change, the legacy loader still uses somewhat more metaspace
than the new loader, but the difference seems to be O(10MB), which
should only impact projects that were close to their MaxMetaspaceSize to
begin with.
I verified using javap that none of the code in this class uses the
scala standard library which should help metaspace since we don't load
much of the scala standard library until we enter xMainImpl.run.
I verified manually that ExternalHooks were still applied by default but
that I could set the incOptions in the Test and Compile configs so that
they weren't used.
Fixes #4624
Fixes #4712
This adds a specialized DependencyResolution instance called `scalaCompilerBridgeDependencyResolution` to download the compiler bridge. It has its own list of resolvers set by `scalaCompilerBridgeResolvers`. For backward compatibility, it will append `externalResolvers.value` as well.
Supershell was not reliably working and I tracked it down to
TaskProgress not actually publishing updates during task execution. This
seemed to happen because the background task was only run once when the
task started up. Once that task exited, no further task reports would be
published. The fix is to start a new thread every time we enter
EvaluateTask. I verified manually that it did not seem to leak threads
because EvaluateTask always calls shutdown, which calls
afterAllCompleted, which stops the progress thread.
I also decreased the default report period to 100ms. I can't imagine
that this will have a big effect on performance. It can be tuned with
the sbt.supershell.sleep parameter.
There are issues when using jdk > 8 where the rt.jar file can be
invalidated by ExternalHooks. This causes spurious rebuilds. I think
it's fair to assume that rt.jar never changes. If a dependency is named
rt.jar, then invalidation may not work correctly but I think that this
is the more important case to handle.
I verified that before this change, it was impossible to run
akka-actor/compile twice in a row using adopt jdk 11 and, after this
change, re-compilation worked as expected.
I discovered that some registered shutdown hooks would crash due to
67df72ab01 because they would try to load
classes from the closed classloader. To fix this, I add an internal
shutdown hooks mechanism that can be managed by sbt. Any unevaluated
shutdown hooks will be run when the sbt main method exits. This means
that they will be run when the user calls reboot. I think that is
reasonable.
I realized that all of the data structures that I needed to isolate the
classpath are contained in the AppProvider interface so there was no
need to use structural reflection on the top class loader.
It turns out that it can take roughly one second to instantiate a
scala.tools.nsc.Global instance for the first time. When sbt is starting
up, it also takes nearly 2 seconds to initialize logging. We can speed
up the boot time by doing these two things concurrently. On my machine,
I saw on average a 500ms decrease in startup time after this change.
I finally realized that the trick is that for non-cygwin Windows, the
available method on the jline wrapped input stream always returns zero.
Unlike on posix, however, the read method is interruptible which means
that we can just spin up a background thread that polls from the input
stream and writes it into a buffer.
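The background reader is roughly (a sketch, not the actual implementation):
    import java.io.InputStream
    import java.util.concurrent.LinkedBlockingQueue

    // read() blocks but is interruptible on Windows, so a daemon thread can pull
    // bytes into a queue that the watch loop polls without blocking.
    def readInBackground(in: InputStream): LinkedBlockingQueue[Int] = {
      val buffer = new LinkedBlockingQueue[Int]
      val reader = new Thread("stdin-reader") {
        override def run(): Unit = {
          var b = in.read()
          while (b != -1) { buffer.put(b); b = in.read() }
        }
      }
      reader.setDaemon(true)
      reader.start()
      buffer
    }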
I verified that it was no longer necessary to hit <enter> after 'r' to
rerun the continuous command on my windows vm after this change.
This commit finally fixes #241 by adding support for sbt to either
print a warning or automatically reload the project if the metabuild
sources have changed. To facilitate this, I introduce a new key,
metaBuildSourceOption, which has three options:
1) IgnoreSourceChanges
2) WarnOnSourceChanges
3) ReloadOnSourceChanges
When IgnoreSourceChanges is set, sbt will not check if the meta build sources
have changed. Otherwise, sbt will use the buildStructure / fileInputs to
get the ChangedFiles for the metabuild. If there are any changes, it
will either warn or reload the build depending on the value of
metaBuildSourceOption.
The mechanism for diffing the files is that I add a step to EvaluateTask
where, if the project has been loaded and
metaBuildSourceOption != IgnoreSourceChanges, we evaluate the needReload
task. If we need a reload, we return an error that indicates that a
Reload is necessary. When that error is detected, the MainLoop will
prepend "reload" to the pending commands for the state. Otherwise we
just print a warning and continue.
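The dispatch itself is simple; a self-contained sketch (the real option values and the needReload task live in sbt, the stand-ins below just make the sketch compile on its own):
    sealed trait MetaBuildSourceOption
    case object IgnoreSourceChanges extends MetaBuildSourceOption
    case object WarnOnSourceChanges extends MetaBuildSourceOption
    case object ReloadOnSourceChanges extends MetaBuildSourceOption

    def onChangedMetaBuildSources(
        option: MetaBuildSourceOption,
        changed: Seq[java.nio.file.Path],
        pendingCommands: List[String],
        warn: String => Unit
    ): List[String] =
      if (changed.isEmpty || option == IgnoreSourceChanges) pendingCommands
      else if (option == ReloadOnSourceChanges) "reload" :: pendingCommands
      else {
        warn(s"build sources changed: ${changed.mkString(", ")}")
        pendingCommands
      }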
I benchmarked the overhead of this and it wasn't too bad. I generally
saw it taking 5-20ms to perform the check. Since this is only done once
per task evaluation run, I don't think it's a big deal. When
IgnoreSourceChanges is set, there is O(10us) overhead. If performance
does become a problem, we could add a global watch service and skip the
needReload evaluation if no files have been modified.
I removed the watchTrackMetaBuild key and made it so that the continuous
builds only track the meta build when
metaBuildSourceOption == ReloadOnSourceChanges
The persistentFileStampCache does seem to work pretty well but in case
users encounter issues, I add a boolean flag that allows the user to
turn this behavior off and always re-stamp every source file in every
task evaluation run.
Previously the persistent attribute map was only reset when the file
event monitor detected a change. This made it possible for the cache to
be inconsistent with the state of the file system*. To fix this, I add an
observer on the file tree repository used by the continuous build that
invalidates the cache entry for any path for which it detects a change.
Invalidating the cache does not stamp the file. That only happens either
when a task asks for the stamp for that file or when the file event
monitor reports an event and we must check if the file was updated or
not.
After this change, touching a source file will not trigger a build
unless the contents of the file actually changes.
I added a test that touches one source file in a project and updates the
content of the other. If the source file that is only touched ever
triggers a build, then the test fails.
* This could lead to under-compilation because ExternalHooks would not
detect that the file had been updated.
I had tried to be cute and only inject certain tasks if they're actually
used, but that made it so that dynamic tasks might not be able to use
them.
The new io version removes the PathFinder <-> Glob implicit translations.
It also has a number of small bug fixes related to directory listing via
FileTreeView.
The main project emits a number of deprecation warnings. I've isolated
the deprecation warnings related to Watch to the DeprecatedContinuous
file. I fixed the deprecation warnings where it was straightforward to
do so. After this change, there are three non-watch related deprecation warnings
emitted:
1) Defaults.scala:3760 uses the deprecated InputTask.apply. This seems
fixable but I'm not in a hurry
2) oldLoadFailed and oldLastGrep are used by Main. I think this could
just be fixed by removing the deprecation warnings and setting them
private[sbt] since they will still be available in the shell.
I previously tried to fix https://github.com/sbt/sbt/issues/4608 in
fc715cab44 by finding the instance of
xsbt.boot.BootFilteredLoader in the classloader hierarchy. This was a
risky approach since it made a lot of assumptions about the classloaders
used to invoke xMain.run. Since the point is to filter out the scala
standard library jar, I reworked things to just find all the parents of
the scala provider loader and then walk the graph from the root
classloader until it finds the classloader that contains the scala
library. If no such classloader exists, it ends up returning the parent
of the scala provider library.
I also renamed the libraryLoader parameter to scalaProviderLoader since
that is what is actually passed in. It is actually the libraryLoader that
we want to exclude.
Ref https://github.com/sbt/sbt/issues/4661
local-preloaded-ivy contains dangling ivy.xml without JAR files.
We might include local-preloaded again once we have a preloaded repository in the Maven layout.
I discovered there were a number of places where closing a ClassLoader
didn't work correctly because I was assuming it was a URLClassLoader
when it was actually a ClasspathFilter. I also incorrectly imported the
wrong kind of URLClassLoader in Run.scala. Finally, I close the
SbtMetaBuildClassLoader when xMain exits now.
I decided that there were too many settings related to file
management that had similar names but did
slightly different things. To improve this, I introduce the ChangedFiles
class to sbt.nio.file and switch to having just two tasks each for file input
and output retrieval: all(Input|Output)Files and
changed(Input|Output)Files. If, for example, changedInputFiles returns
None that means that either the task has not yet been run or there were
no changes. If there have been any changes, then it will return
Some(changes) and the user can extract the relevant changes that they
are interested in.
The code may be slightly more verbose in a few places, but I think it's
worth it for the conceptual clarity.
This commit unifies my previous work for automatically watching the
input files for a task with support for automatically tracking and
cleaning up the output files of a task. The big idea is that users may
want to define tasks that depend on the file outputs of other tasks and
we may not want to run the dependent tasks if the output files of the
parent tasks are unmodified.
For example, suppose we wanted to make a plugin for managing typescript
files. There may be, say, two tasks with the following inputs and
outputs:
compileTypescript = taskKey[Unit]("shells out to compile typescript files")
fileInputs -- sourceDirectory / ** / "*.ts"
fileOutputs -- target / "generated-js" / ** / "*.js"
minifyGeneratedJS = taskKey[Path]("minifies the js files generated by compileTypescript to a single combined js file.")
dependsOn: compileTypeScript / fileOutputs
Given a clean build, the following should happen
> minifyGeneratedJS
// compileTypescript is run
// minifyGeneratedJS is run
> minifyGeneratedJS
// no op because nothing changed
> minifyGeneratedJS / clean
// removes the file returned by minifyGeneratedJS.previous
> minifyGeneratedJS
// re-runs minifyGeneratedJS with the previously compiled js artifacts
> compileTypescript / clean
// removes the generated js files
> minifyGeneratedJS
// compileTypescript is run because the previous clean removed the generated js files
// minifyGeneratedJS runs because the artifacts have changed
> clean
// removes the generated js files and the minified js file
> minifyGeneratedJS
// compileTypescript is run because the generated js files were removed
// minifyGeneratedJS is run both because its output was removed and because its inputs changed
Moreover, if compileTypescript fails, we want minifyGeneratedJS to fail
as well.
This commit makes this all possible. It adds a number of tasks to
sbt.nio.Keys that deal with the output files. When injecting settings, I
now identify all tasks that return Seq[File], File, Seq[Path] and Path
and create a hidden special task: dynamicFileOutputs: TaskKey[Seq[Path]]
This special task runs the underlying task and converts the result to
Seq[Path]. From there, we can have the tasks like changedOutputPaths
delegate to dynamicFileOutputs which, by proxy, runs the underlying
task. If any task in the input / output chain fails, the entire sequence
fails.
Unlike the fileInputs, we do not register the dynamicFileOutputs or
fileOutputs with continuous watch service so these paths will not
trigger a continuous build if they are modified. Only explicit unmanaged
input sources should do that.
As part of this, I also added automatic generation of a custom clean task for
any task that returns Seq[File], File, Seq[Path] or Path. I also added
aggregation so that clean can be defined in a configuration or project
and it will automatically run clean for all of the tasks that have a
custom clean implementation in that task or project. The automatic clean
task will only delete files that are in the task target directory to
avoid accidentally deleting unmanaged files.
This commit refactors things so that the nio apis are located primarily
in the nio package. Because the nio keys are a first class sbt feature,
I had to add import sbt.nio._ and sbt.nio.Keys._ to the autoimports in
BuildUtil.scala
I had some ideas for allowing the user to get a copy of System.in during
a continuous build but I can't really see a good use case now so I'm
going to remove it before 1.3.0.
In my recent changes to watch, I have been moving towards a world in
which sbt manages the file inputs and outputs at the task level. The
main idea is that we want to enable a user to specify the inputs and
outputs of a task and have sbt able to track those inputs across
multiple task evaluations. Sbt should be able to automatically trigger a
build when the inputs change and it also should be able to avoid task
evaluation if none of the inputs have changed.
The former case of having sbt automatically watch the file inputs of a
task has been present since watch was refactored. In this commit, I
make it possible for the user to retrieve the lists of new, modified and
deleted files. The user can then avoid task evaluation if none of the
inputs have changed.
To implement this, I inject a number of new settings during project
load if the fileInputs setting is defined for a task. The injected
settings are:
allPathsAndAttributes -- this retrieves all of the paths described by
the fileInputs for the task along with their attributes
fileStamps -- this retrieves all of the file stamps for the files
returned by allPathsAndAttributes
Using these two injected tasks, I also inject a number of derived tasks,
such as allFiles, which returns all of the regular files returned by
allPathsAndAttributes and changedFiles, which returns all of the regular
files that have been modified since the last run.
Using these injected settings, the user is able to write tasks that
avoid evaluation if the inputs haven't changed.
foo / fileInputs += baseDirectory.value.toGlob / ** / "*.scala"
foo := {
  foo.previous match {
    case Some(p) if (foo / changedFiles).value.isEmpty => p
    case _ => fooImpl((foo / allFiles).value)
  }
}
To make this whole mechanism work, I add a private task key:
val fileAttributeMap = taskKey[java.util.HashMap[Path, Stamp]]("...")
This keeps track of the stamps for all of the files that are managed by
sbt. The fileStamps task will first look for the stamp in the attribute
map and, only if it is not present, it will update the cache. This
allows us to ensure that a given file will only be stamped once per task
evaluation run no matter how the file inputs are specified. Moreover, in
a continuous build, I'm able to reuse the attribute map which can
significantly reduce latency because the default file stamping
implementation used by zinc is fairly expensive (it can take anywhere
between 300-1500ms to stamp 5000 8kb source files on my mac).
I also renamed some of the watch related keys to be a bit more clear.
This trait may not even survive until 1.4.0. It should definitely not be
public. I got a little overexcited about programming with higher kinded
types when I added it.
The newest version of io repackages a number of classes into the
sbt.nio.* packages. It also changes some of the semantics of glob
related apis. This commit updates all of the usages of the updated apis
within sbt but should have no functional difference.
Since the new watch implementation has yet to be widely deployed, we
should hold off on deprecating the old keys. They could still be
deprecated in a patch release or in 1.4.0.