Fixes#4574
This defines the `classLoaderLayeringStrategy` key at `Global` and `Zero / Test` level, and uses the scope delegation to pick them out from `test`.
I realized that Stamped.File was a bad interface that was really just an
implementation detail of external hooks. I updated the
GlobLister.{ all, unique } methods to return Seq[(Path, FileAttributes)]
rather than Stamped.File which is a much more natural api and one I
could see surviving the switch to nio based apis planned for
1.4.0/2.0.0. I also added a simple scripted test for glob listing. The
GlobLister.all method is implicitly tested all over the place since the
compile task uses it, but it's good to have an explicit test.
The caching repository does not work universally so set the default to
always poll. This is still faster than in sbt 1.2.x because of
performance improvements that I added for listing directories.
I decided that FileCacheEntry was a bad name because the methods did not
necessarily have anything to do with caching. Moreover, because it is
exposed in a public interface, it shouldn't be in the internal package.
Rather than exposing the FileEventMonitor.Event types, which are under
active development in the io repo, I am adding a new event trait to
FileCacheEntry. This trait doesn't expose any internal implementation
details.
Windows io really doesn't handle concurrent readers and writers all that
well. Using the LegacyFileTreeRepository was problematic in windows
scripted tests because even though the repository implementation did not
use the cache in its list methods, it did persistently monitor the
directories that were registered. The monitor has to do a lot of
io on a background thread to maintain the cache. This caused io
contention that would cause IO.createDirectory to fail with an obscure
AccessDeniedException. The way to avoid this is to prevent the
background io from occurring at all.
I don't necessarily think this will impact most users running sbt
interactively with a cache, but it did cause scripted tests to fail. For
that reason I made the default in non-interactive/shell use cases on
windows to be a PollingFileRepository which never monitors the file
system except when we are in a watch. The LegacyFileTreeRepository works
fine on mac and linux which have a more forgiving file system.
To make this work, I had to add FileManagement.toMonitoringRepository.
There are now two kinds of repositories that cannot monitor on their
own: HybridPollingFileTreeRepository and PollingFileRepository. The
FileManagement.toMonitoringRepository makes a new repository that turns
on monitoring for those two repository types and disables the close
method on all other repositories so that closing the FileEventMonitor
does not actually close the global file repository.
I ran into a couple of issues with the clean implementation. I changed
the logging to print to stdout instead of streams if enabled. I also
added a helper, Clean.deleteContents that recursively deletes all of the
contents of a directory except for those that match the exclude filter
parameter.
Using a normal logger was a bad idea because we are actually deleting
the target/streams directory when running clean.
The previous implementation worked by getting the full list of files to
delete, reverse sorting it and then deleting every element in the list.
While this can work well it many circumstances, if the directory is
still being written to during the recursive deletion, then we could miss
files that were added after we fetched all of the files. The new version
lazily lists the subdirectories as needed.
The Defaults.scala file has a lot going on. I am trying to generally
follow the pattern of implementing the default task implementation in a
different file and just adding the appropriate declarations in
Defaults.scala.
This rewroks the cleanTask so that it only removes a subset of the
files in the target directory. To do this, I add a new task, outputs,
that returns the glob representation of the possible output files for
the task. It must be a task because some outputs will depend on streams.
For each project, the default outputs are all of the files in
baseDirectory / target.
Long term, we could enhance the clean task to be automatically generated
in any scope (as an input task). We could then add the option for the
task scoped clean to delete all of the transitive outputs of the class.
That is beyond the scope of this commit, however.
I copied the scripted tests from #3678 and added an additional test to
make sure that the manage source directory was explicitly cleaned.
The clean task is unreasonably slow because it does a lot of redundant
io. In this commit, I update clean to be implemented using globs. This
allows us to (optionally) route io through the file system cache. There
is a significant performance improvement to this change. Currently,
running clean on the sbt project takes O(6 seconds) on my machine. After
this change, it takes O(1 second).
To implement this, I added a new setting cleanKeepGlobs to replace
cleanKeepFiles. I don't think that cleanKeepFiles returning Seq[File] is
a big deal for performance because, by default, it just contains the
history file so there isn't much benefit to accessing a single file
through the cache. The reason I added the setting was more for
consistency and to help push people towards globs in their own task
implementations.
Part of the performance improvement comes from inverting the problem.
Before we would walk the file system tree from the base and recursively
delete leafs and nodes in a depth first traversal. Now we collect all of
the files that we are interested in deleting in advance. We then sort
the results lexically by path name and then perform the deletions in
that order. Because children will always comes first in this scheme,
this will generally allow us to delete a directory.
There is an edge case that if files are created in a subdirectory after
we've created the list to delete, but before the subdirectory is
deleted, then that subdirectory will not be deleted. In general, this
will tend to impact target/streams because writes occur to
target/streams during traversal. I don't think this really matters for
most users. If the target directory is being concurrently modified with
clean, then the user is doing something wrong.
To ensure legacy compatibility, I re-implement cleanKeepFiles to return
no files. Any plugin that was appending files to the cleanKeepFiles task
with `+=` or `++=` will continue working as before because I explicitly
add those files to the list to delete. I updated the actions/clean-keep
scripted test to use both cleanKeepFiles and cleanKeepGlobs to ensure
both tasks are correctly used.
Bonus: add debug logging of all deleted files
Right now, the sbt.internal.io.Source is something of a second class
citizen within sbt. Since sbt 0.13, there have been extension classes
defined that can convert a file to a PathFinder but no analog has been
introduced for sbt.internal.io.Source.
Given that sbt.internal.io.Source was not really intended to be part of
the public api (just look at its package), I think it makes sense to
just replace it with Glob. In this commit, I add extension
methods to Glob and Seq[Glob] that make it possible to easily
retrieve all of the files for a particular Glob within a task. The
upshot is that where previously, we'd have had to write something like:
watchSources += Source(baseDirectory.value / "src" / "main" / "proto", "*.proto", NothingFilter)
now we can write
watchGlobs += baseDirectory.value / "src" / "main" / "proto" * "*.proto"
Moreover, within a task, we can now do something like:
foo := {
val allWatchGlobs: Seq[File] = watchGlobs.value.all
println(allWatchSources.mkString("all watch source files:\n", "\n", ""))
}
Before we would have had to manually retrieve the files.
The implementation of the dsl uses the new GlobExtractor class which
proxies file look ups through a FileTree.Repository. This makes it so
that, by default, all file i/o using Sources will use the default
FileTree.Repository. The default is a macro that returns
`sbt.Keys.fileTreeRepository.value: @sbtUnchecked`. By doing it this
way, the default repository can only be used within a task definition
(since it delegates to `fileTreeRepository.value`). It does not,
however, prevent the user from explicitly providing a
FileTree.Repository instance which the user is free to instantiate
however they wish.
Bonus: optimize imports in Def.scala and Defaults.scala
The FileTreeViewConfig abstraction that I added was somewhat unwieldy
and confusing. The original intention was to provide users with a lot of
flexibility in configuring the global file tree repository used by sbt.
I don't think that flexibility is necessary and it was both conceptually
complicated and made the implementation complex. In this commit, I add a
new boolean flag enableGlobalCachingFileTreeRepository that toggles
which kind of FileTreeRepository to use globally.
There are actually three kinds of repositories that could be returned:
1) FileTreeRepository.default -- this caches the entire file system
tree it hooks into the cache's event callbacks to create a file event
monitor. It will be used if enableGlobalCachingFileTreeRepository is
true and Global / pollingGlobs := Nil
2) FileTreeRepository.hybrid -- similar to FileTreeRepository.default
except that it will not cache any files that are included in
Global / pollingGlobs. It will be used if
enableGlobalCachingFileTreeRepository is true and
Global / pollingGlobs is non empty
3) FileTreeRepository.legacy -- does not cache any of the file system
tree, but does maintain a persistent file monitoring process that is
implemented with a WatchServiceBackedObservable. Because it doesn't
poll, in general, it's ok to leave the monitoring on in the
background. One reason to use this is that if there are any issues
with the cache being unable to accurately mirror the underlying file
system tree, this repository will always poll the file system
whenever sbt requests the entries for a given glob. Moreover, the
file system tree implementation is very similar to the implementation
that was used in 1.2.x so this gives users a way to almost fully opt
back in to the old behavior.
This new version of io breaks source and binary compatibility everywhere
that uses the register(path: Path, depth: Int) method that is defined on
a few interfaces because I changed the signature to register(glob:
Glob). I had to convert to using a glob everywhere that register was
called.
I also noticed a number of places where we were calling .asFile on a
file. This is redundant because asFile is an extension method on File
that just returns the underlying file.
Finally, I share the IOSyntax trait from io in AllSyntax. There was more
or less a TODO suggesting this change. The one hairy part is the
existence of the Alternative class. This class has unfortunately somehow
made it into the sbt package object. While I doubt many plugins are
using this, it doesn't seem worth breaking binary compatibility to get
rid of it. The issue is that while Alternative is defined private[sbt],
the alternative method in IOSyntax is public, so I can't get rid of
Alternative without breaking binary compatibility.
I'm not deprecating Alternative for now because the sbtProj still has
xfatal warnings on. I think in many, if not most, cases, the Alternative
class makes the code more confusing as is often the case with custom
operators. The confusion is mitigated if the abstraction is used only in
the file in which it's defined.
Fixes#4461
This opens up ExecuteProgress API that's been around under private[sbt].
Since the state passing mechanism hasn't been used, I got rid of it.
The build user can configure the build using two keys Boolean `taskProgress` and `State => Seq[TaskProgress]` `progressReports`. `useSuperShell` is lightweight key on/off switch for the super shell that can be used as follows:
```scala
Global / SettingKey[Boolean]("useSuperShell") := false
```
I often find that when I run a command it takes a long time to start up
because sbt triggers a full gc. To improve the ux, I update the command
exchange to run full gc only once while it's waiting for a command to
run and only after the user has been idle for at least one minute.
Bonus: optimize imports
Previously, we were leaking the internal details of incremental
compilation to users by defining FileTree(DataView|Repository)[Stamp].
To avoid this, I introduce the new class FileCacheEntry that is quite
similar to Stamp except defined using scala Options rather than java
Optionals. The implementation class just delegates to an actual Stamp
and I provided a private[sbt] ops class that adds a
method `stamp` to FileCacheEntry. This will usually just extract the
stamp from the implementation class. This allows us to use
FileCacheEntry almost interchangeably with Stamp while still avoiding
exposing users to Stamp.
In the FileTreeDataView use case, we were previously working with
FileTreeDataView[Stamped], which actually contained a lot of redundant
information because FileTreeDataView.Entry[_] has a toTypedPath method
that could be used to read the path related fields in Stamped. Instead,
we can just return the Stamp itself in FileTreeDataView.list* methods
and convert to Stamped.File where needed (i.e. in ExternalHooks).
Also move BasicKeys.globalFileTreeView to Keys since it isn't actually
used in the main-command project.
Resident compilation actually works pretty well most of the time*,
but if there ever is an issue with the cached compilation, we should be
able to easily clear the cache.
* I've only had issues when package objects are involved
If the managedSources task writes into an unmanaged source directory,
that would cause an infinite loop. I don't think it's worth doing out of
band task execution to try and prevent this.
The community build was broken for some projects because I broke builds
that relied on the unscoped definition of `runner`. To preserve legacy
behavior, I restore the old unscoped behavior and append the new scoped
runners that used the layered classloaders. This makes more sense
because the layered classloaders were specifically designed for the
Runtime and Test configurations and may not make sense in other
contexts.
I noticed that sometimes when running scripted tests that I'd run out of
metaspace. I believe that this may be due to the caffeine cache leaking
classloaders. Regardless, it's a good idea to clear the cache whenever
we shutdown the command exchange or reload the state.
Previously, the ClassLoaderLayeringStrategy was set globally. This
didn't really make sense because the Runtime and Test configs had
different strategies available (Test being a superset of Runtime).
Instead, we now set the layering strategy in the Runtime and Test
configurations directly. In doing this, we can eliminate the Default
ClassLoaderLayeringStrategy. Previously this had existed so that we
could set the layering strategy globally and have it do the right thing
in both test and runtime.
To implement this, I factored out the logic for generating the layered
classloader in the test task and shared it with the runtime task. I did
this because I realized that Test / run is a thing. Previously I had
been operating under the assumption that the runner would never include
the test dependencies. Once I realized this, it made sense to combine
the logic in both tasks.
As a bonus, I only allow the layering strategies that explicitly make
sense to be set in each configuration. If the user sets an invalid
strategy, an error will be thrown that specifies the valid strategies
for the task.
I also added ScalaInstance as an option for the runtime layer. It was an
oversight that this was left out.
In code review, @eed3si9n suggested that I switch to a more verbose and
descriptive naming scheme. In addition to trying to make layers more
descriptive, I also made the various layer case objects extend the
relevant layers so it's more clear what the layer should look like.
I was seeing spurious travis failures and I finally tracked it down to
the fact that in some cases, the project metabuild would sometimes use a
caching file tree repository instead of a polling repository. This
caused problems because the caching repository can take a few milliseconds
to detect changes in a directory. Because scripted copies the project
sources to the temporary test directory, it was possible for the project
meta build compilation to be initiated before the cache was aware of all
of the files.
The reason this happened was because scripted would create a state where
the remaining commands looked like:
List(sbtPopOnFailure, resumeFromFailure, notifyUsersAboutShell, iflast shell, ~compile, < 41684)
The ~compile command was causing the continuous flag to get set to true
which caused the default file tree repository task to return the caching
version. The reason for the continuous flag was so that when sbt is
started in a non-interactive mode where the command is to be repeated,
then we use the caching file tree repository. To support this use case,
we just need to check that the last command begins with `~`, not that
_any_ command begins with `~`.
We want the user to be able to invalidate the classloader cache in the
event that it somehow gets in a bad state. The cache is, however,
defined in multiple configurations, so there are in fact many
ClassLoaderCache instances that are managed by sbt. To make this sane, I
add a global cache that is keyed by a TaskKey[_] and can return
arbitrary data back. Invalidating all of the ClassLoaderCache instances
is then as straightforward as just replacing the TaskRepository
instance.
I also went ahead and unified the management of the global file tree
repository. Instead of having to specifically clear the file tree
repository or the classloader cache, the user can now invalidate both
with the new clearCaches command.
Using the data structures that I added in the previous commits, it is
now possible to rework the run and test task to use (configurable)
layered class loaders. The layering strategy is globally set to
LayeringStrategy.Default. The default strategy leads to what is
effectively a three layered ClassLoader for the both the test and run
tasks. The first layer contains the scala instance (and test framework
loader in the test task). The second layer contains all of the
dependencies for the configuration while the third layer contains the
project artifacts.
The layering strategy is very easily changed both at the Global or
Configuration level, e.g. adding
Test / layeringStrategy := LayeringStrategy.Flat
to the project build.sbt will make the test task not even use the scala
instance and instead a create a single layer containing the full
classpath of the test task.
I also tried to ensure that all of the ClassLoaders have good toString
overrides so that it's easy to see how the ClassLoader is constructed
with, e.g. `show testLoader`, in the sbt console.
In this commit, the ClassLoaderCache instances are settings. In the next
commit, I make them tasks so that we can easily clear out the caches
with a command.
This introduces a new trait LayeringStrategy that is used to configure
how sbt constructs the ClassLoaders used by the run and test tasks. In
addition to defining the various options, I try to give a good high
level overview of the problem that the LayeringStrategy is intended to
address in its scaladoc.
In order to speed up the start up time of the test and run tasks, I'm
introducing a ClassLoaderCache that can be used to avoid reloading the
classes in the project dependencies (which includes the scala library).
I made the api as minimal as possible so that we can iterate on the
implementation without breaking binary compatibility. This feature will
be gated on a feature flag, so I'm not concerned with the cache class
loaders being useable in every user configuration. Over time, I hope
that the CachedClassLoaders will be a drop in replacement for the
existing one-off class loaders*.
The LayeredClassLoader was adapted from the NativeCopyLoader. The main
difference is that the NativeCopyLoader extracts named shared libraries
into the task temp directory to ensure that the ephemeral libraries are
deleted after each task run. This is a problem if we are caching the
ClassLoader so for LayeredClassLoader I track the native libraries that
are extracted by the loader and I delete them either when the loader is
explicitly closed or in a shutdown hook.
* This of course means that we both must layer the class loaders
appropriately so that the project code is in a layer above the cached
loaders and we must correctly invalidate the cache when the project, or
its dependencies are updated.
I am going to be introducing multiple caches throughout sbt and I am
going to build these features out using this simple Repository
interface. The idea is that we access data by some key through the
Repository. This allows us to use the strategy pattern to easily switch
the runtime implementation of how to get the data.
I am going to add a classloader cache to improve the startup performance
of the run and test tasks. To prevent the classloader cache from having
unbounded size, I'm adding a simple LRUCache implementation to sbt. An
important characteristic of the implementation of the cache is that when
entries are evicted, we run a callback to cleanup the entry. This allows
us to automatically cleanup any resources created by the entry.
This is a pretty naive implementation that uses an array of entries that
it manipulates as elements are removed/accessed. In general, I expect
these caches to be quite small <= 4 elements, so the storage overhead /
performance of the simple implementation should be quite good. If
performance ever becomes an issue, we can specialzed LRUCache.apply to
use a different implementation for caches with large limits.
The `sbt-server` was prepending a new probem and not appending.
The result was a `textDocument/publishDiagnostics` notification
containing a inverted list of problems compare to what was show in the
sbt console.
It drives me crazy that in intellij when I do the go to class task that
TestBuild.Keys comes up before Keys. Given how central Keys is to sbt,
it doesn't seem like a good idea to alias that particular class name.
It was becoming a pain to work on these files in intellij because the
auto-import feature would implicitly optimize all of the imports in
these files, leading to a large diff. I'd then have to go and manually
add the import that I care about. This change does add some wildcard
imports, which I don't always love, but these files are so unwieldy
already that I think it's worth it to have the imports follow the format
preferred by intellij.
It has long been a frustration of mine that it is necessary to prepend
multiple commands with a ';'. In this commit, I relax that restriction.
I had to reorder the command definitions so that multi comes before act.
This was because if the multi command did not have a leading semicolon,
then it would be handled by the action parser before the multi command
parser had a shot at it. Sadness ensued.
In #4446, @japgolly reported that in some projects, if a parent project
was broken, then '~' would immediately exit upon startup. I tracked it
down to this managed sources filter. The idea of this filter is to avoid
getting stuck in a build loop if managedSources writes into an unmanaged
source directory. If the (managedSources in ThisScope).value line
failed, however, it would cause the watchSources, and by delegation,
watchTransitiveSources task to fail. The fix is to only create this
filter if the managedSources task succeeds.
I'm not 100% sure if we shouldn't just get rid of this filter entirely
and just document that '~' will probably loop if a build writes the
result of managedSources into an unmanaged source directory.
Fixes#4437
Until now, sbt was resolved twice once by the launcher, and the second time by the metabuild.
This excludes sbt from the metabuild graph, and instead uses app classpath from the launcher.
Fixes#3436
This implements isMetaBuild setting that is explicitly for meta build only,
unlike sbtPlugin setting which can be used for both meta build and plugin development purpose.
I noticed that when using the latest nightly, triggered execution would
fail to work if I switched projects with, e.g. ++2.10.7. This was
because the background thread that filled the file cache was incorrectly shutdown.
To fix this, we just need to close whatever view is cached in the
globalFileTreeView attribute in the exit hook rather than the view
created by the method.
After making this change and publishing a local SNAPSHOT build, I was
able to switch projects with ++ and have triggeredExecution continue to
work.
On windows* it was possible to get into a loop where the build would
continually restart because for some reason the build.sbt file would get
touched during test (I did not see this behavior on osx). Thankfully,
the repository keeps track of the file hash and when we detect that the
build file has been updated, we check the file hash to see if it
actually changed.
Note that had this bug shipped, it would have been fixable by overriding
the watchOnEvent task in user builds.
The loop would occur if I ran ~filesJVM/test in
https://github.com/swoval/swoval. It would not occur if I ran
test:compile, so the fact that the build file is being touched seems
to be related to the test run itself.
It was possible that on startup, when this function was first invoked,
that the default boot commands are present. This was a problem because
the global file repository is instantiated using the value of this task.
When we start a continuous build, this task gets run again to evaluate
again.
When sbt is started without an implicit task list, then the task is
implicitly shell as indicated by the command "iflast shell". We can use
this to determine whether or not to use the global file system cache or
not.
Ideally we use the FileTreeRepository for interactive sessions by
default. A continuous build is effectively interactive, so I'd like that
case to also use the file tree repository. To avoid breaking scripted
tests, many of which implicitly expect file tree changes to be
instantaneously available, we set interactive to true only if we are not
in a scripted run, which can be verified by checking that the commands
contains "setUpScripted".
Sometimes a user may want to rerun their task even if the source files
haven't changed. Presently this is a little annoying because you have to
hit enter to stop the build and then up arrow or <ctrl+r> plus enter to
rebuild. It's more convenient to just be able to press the 'r' key to
re-run the task.
To implement this, I had to make the watch task set up a jline terminal
so that System.in would be character buffered instead of line buffered.
Furthermore, I took advantage of the NonBlockingInputStream
implementation provided by jline to wrap System.in. This was necessary
because even with the jline terminal, System.in.available doesn't return
> 0 until a newline character is entered. Instead, the
NonBlockingInputStream does provide a peek api with a timeout that will
return the next unread key off of System.in if there is one available.
This can be use to proxy available in the WrappedNonBlockingInputStream.
To ensure maximum user flexibility, I also update the watchHandleInput Key to
take an InputStream and return an Action. This setting will now receive
the wrapped System.in, which will allow the user to create their own
keybindings for watch actions without needing to use jline themselves.
Future work might make it more straightforward to go back to a line
buffered input if that is what the user desires.
For projects with a large number of files, zinc has to do a lot of work
to determine which source files and binaries have changes since the last
build. In a very simple project with 5000 source files, it takes roughly
750ms to do a no-op compile using the default incremental compiler
options. After this change, it takes about 200ms. Of those 200ms, 50ms
are due to the update task, which does a partial project resolution*.
The implementation is straightforward since zinc already provides an api
for overriding the built in change detection strategy. In a previous
commit, I updated the sources task to return StampedFile rather than
regular java.io.File instances. To compute all of the source file
stamps, we simply list the sources and if the source is in fact an
instance of StampedFile, we don't need to compute it, otherwise we
generate a StampedFile on the fly. After building a map of stamped files
for both the sources files and all of the binary dependencies, we simply
diff these maps with the previous results in the changedSources,
changedBinaries and removedProducts methods.
The new ExternalHooks are easily disabled by setting
`externalHooks := _ => None`
in the project build.
In the future, I could see moving ExternalHooks into the zinc project so
that other tools like bloop or mill could use them.
* I think this delay could be eliminated by caching the UpdateResult so
long as the project doesn't depend on any snapshot libraries. For a
project with a single source, the no-op compile takes O(50ms) so caching
the project resolution would make compilation start nearly
instantaneous.
I realized that using the cache has the potential to cause issues for
batch processing in CI if some tasks assume that a file created by one
task will immediately be visible in the other. With the cache, there is
typically on O(10ms) latency between a file being created and appearing
in the cache (at least on OSX). When manually running commands, that
latency doesn't matter.
It is not always possible to monitor a directory using OS file system
events. For example, inotify does not work with nfs. To work around
this, I add support for a hybrid FileTreeViewConfig that caches a
portion of the file system and monitors it with os file system
notification, but that polls a subset of the directories. When we query
the view using list or listEntries, we will actually query the file
system for the polling directories while we will read from the cache for
the remainder. When we are not in a continuous build (~ *), there is no
polling of the pollingDirectories but the cache will continue to update
the regular directories in the background. When we are in a continuous
build, we use a PollingWatchService to poll the pollingDirectories and
continue to use the regular repository callbacks for the other
directories.
I suspect that #4179 may be resolved by adding the directories for which
monitoring is not working to the pollingDirectories task.
Now that we have the fileTreeView task, we can generalized the process
of collecting files from the view (which may or may not actually cache
the underlying file tree). I moved the implementation of collectFiles
and addBaseSources into the new FileManagement object because Defaults
is already too large of a file. When we query the view, we also need to
register the directory we're listing because if the underlying view is a
cache, we must call register before any entries will be available.
Because FileTreeDataView doesn't have a register method, I implement
registration with a simple implicit class that pattern matches on the
underlying type and only calls register if it is actually a
FileRepository.
A side effect of this change is that the underlying files returned by
collectFiles and appendBaseSources are StampedFile instances. This is so
that in a subsequent commit, I can add a Zinc external hook that will
read these stamps from the files in the source input array rather than
compute the stamp on the fly. This leads to a substantial reduction in
Zinc startup time for projects with many source files. The file filters
also may be applied more quickly because the isDirectory property (which
we check for all source files) is read from a cached value rather than
requiring a stat.
I had to update a few of the scripted tests to use the `1.2.0`
FileTreeViewConfig because those tests would copy a file and then
immediately re-compile. The latency of cache invalidation is O(1-10ms),
but not instantaneous so it's necessary to either use a non-caching
FileTreeView or add a sleep between updates and compilation. I chose the
former.
Every time that the compile task is run, there are potentially a large
number of iops that must occur in order for sbt to generate the source
file list as well as for zinc to check which files have changed since
the last build. This can lead to a noticeable delay between when a build
is started (either manually or by triggered execution) and when
compilation actually begins. To reduce this latency, I am adding a
global view of the file system that will be stored in
BasicKeys.globalFileTreeView.
To make this work, I introduce the StampedFile trait, which augments the
java.io.File class with a stamp method that returns the zinc stamp for
the file. For source files, this will be a hash of the file, while for
binaries, it is just the last modified time. In order to gain access to
the sbt.internal.inc.Stamper class, I had to append addSbtZinc to the
commandProj configurations.
This view may or may not use an in-memory cache of the file system tree
to return the results. Because there is always the risk of the cache
getting out of sync with the actual file system, I both make it optional
to use a cache and provide a mechanism for flushing the cache. Moreover,
the in-memory cache implementation in sbt.io, which is backed by a
swoval FileTreeRepository, has the property that touching a monitored
directory invalidates the entire directory within the cache, so the
flush command isn't even strictly needed in general.
Because caching is optional, the global is of a FileTreeDataView, which
doesn't specify a caching strategy. Subsequent commits will make use of
this to potentially speed up incremental compilation by caching the
Stamps of the source files so that zinc does not need to compute the
hashes itself and will allow for continuous builds to use the cache to
monitor events instead of creating a new, standalone FileEventMonitor.
There may be instances where the user may wish to stop the watch if an
error occurs running the task. To facilitate this, I add boolean
parameter, lastStatus, to watchShouldTerminate. The value is computed by
modifying the state used to run the task to have a custom onFailure
command. If the task fails, the returned state will have the onFailure
command will be enqueued at the head of the remaining commands. The
result of the task then becomes true if the custom onFailure is not
present in the remaining commands and false if it is. We don't actually
run this command, so it's just implemented with the identity function.
I also updated Watched.watch to return an Action instead of Unit. This
enables us to return a failed state if Watched.watch returns
HandleError.
This commit reworks Watched to be more testable and extensible. It also
adds some small features. The previous implementation presented a number
of challenges:
1) It relied on external side effects to terminate the watch, which was
difficult to test
2) It exposed irrelevant implementation details to the user in the
methods that exposed the WatchState as a parameter.
3) It spun up two worker threads. One was to monitor System.in for user
input. The other was to poll the watch service for events and write
them to a queue. The user input thread actually broke '~console'
because nearly every console session will hit the <enter> key, which
would eventually cause the watch to stop when the user exited the
console.
To address (1), I add the shouldTerminate method to WatchConfig. This
takes the current watch iteration is input and if the function returns
true, the watch will stop.
To address (2), I replace the triggeredMessage and watchingMessage keys
with watchTriggeredMessage and watchStartMessage. The latter two keys
are functions that do not take the WatchState as parameters. Both
functions take the current iteration count as a parameter and the
watchTriggeredMessage also has a parameter for the path that triggered
the build.
To address (3), I stop using the sbt.internal.io.EventMonitor and
instead use the sbt.io.FileEventMonitor. The latter class is similar to
the former except that it's polling method accepts a duration, which may
be finite or infinite) and returns all of the events that occurred since
it was last polled. By adding the ability to poll for a finite amount of
time, we can interleave polling for events with polling System.in for
user input, all on the main thread. This eliminates the two extraneous
threads and fixes the '~console' use case I described before.
I also let the user configure the function that reads from System.in via
the watchHandleInput method. In fact, this method need not read from
System.in at all since it's just () => Watched.Action. The reason that
it isn't () => Boolean is that I'd like to leave open the option for the
ability to trigger a build via user input, not just terminating the
watch. My initial idea was to add the ability to type 'r' to re-build in
addition to <enter> to exit. This doesn't work without integrating
jline though because the input is buffered. Regardless, for testing
purposes, it gives us the ability to add a timeout to the watch by
making handleInput return true when a deadline expires.
The tests are a bit wonky because I still need to rely on side effects
in the logging methods to orchestrate the sequence of file events that
I'd like to test. While I could move some of this logic into a
background thread, there still needs to be coordination between the
state of the watch and the background thread. I think it's easier to
reason about when all of the work occurs on the same thread, even if it
makes these user provided functions impure.
I deprecated all of the previous watch related keys that are no longer
used with the new infrastructure. To avoid breaking existing builds, I
make the watchConfig task use the deprecated logging methods if they are
defined in the user's builds, but sbt will not longer set the default
values. For the vast majority of users, it should be straightforward to
migrate their builds to use the new keys. My hunch is that the of the
deprecated keys, only triggeredMessage is widely used (in conjunction
with the clear screen method) and it is dead simple to replace it with
watchTriggeredMessage.
Note: The FileTreeViewConfig class is not really necessary for this commit.
It will become more important in a subsequent commit which introduces an
optional global file system cache.
This commit makes watch event logging work in the '~' command. The
previous design of the command made this difficult, so there is a
significant re-design of the implementation of '~'. I believe that this
redesign will allow the feature to be maintained and improved more
easily moving forward. With the redesign, it is now possible to test the
business logic of the watch command (and I add a rudimentary test that I
will build upon in subsequent commits).
A bonus of this redesign is that now if the user tries to watch an
invalid command, the watch will immediately terminate with an error
rather than get stuck waiting for events when the task can never
possibly succeed.
The previous implementation of the '~' command makes it difficult
to dynamically control the implementation arguments because it is
implemented in the command project which makes it unable to depend on
any task keys that are defined in the build. It works around this by
putting all of it's configuration in the Watched attribute which is
stored globally. This would not have been necessary if the function had
been defined in the main project where it could just extract the value
of the watched task rather than relying on the global attribute value.
Moreover, because it cannot depend on tasks, it makes it nigh impossible
to use the logging framework within the '~' command.
Another issue with the previous implementation is that it's somewhat
difficult to reason about. The executeContinuously has effectively two
entry points: one for the first time the command is run and one for each
subsequent invocation when a new build is triggered. The successive
invocations are triggered by prepending commands to run to the previous
state. This is made recursive by prepending the initial command (that
was prefixed with '~'. Which branch we're in is determined by checking
for the existence of a temporary attribute, that we must ensure that we
remove when the build is stopped. This makes a lot of behavior non-local and
difficult for an outsider who is less familiar with sbt to understand.
Broadly, this refactor does two things:
1) Move the definition of continuous from BasicCommands to BuiltInCommands
2) Re-work the implementation to be executed in code rather than using
the sbt dsl.
The first part is simple. We just add an implementation of continuous to
BuiltInCommands and remove it from the list of BasicCommands. We need to
leave in the legacy implementation for binary compatibility. I also
moved all of the actual implementation logic into Watched, which makes
maintenance easier since most of the logic is in one place.
The second part is more complicated. Rather than rely on the sbt dsl
(e.g. `(ClearOnFailure :: next :: FailureWall :: repeat :: s)`) to
parse and run the command. We manually parse the command and generate a
task of type `() => State`. We don't actually need to do anything with
the generated state because we're going to return the original state at
the end of the command no matter what. With this task, we can then
create a tail recursive function that repeatedly executes the task until
the watch is terminated.
The parsing is handled in the Watch.command method (which is where I
moved the refactored BasicCommands.continuous implementation). The
actual task running and monitoring is handled in Watched.watch. This
method has no reference to the sbt state, which makes it testable. It sets
up an event monitor and then delegates the recursive monitoring to a
small nested function, Watched.watch.impl. One nice thing about this
approach is that it is very easy to reason about the life cycle of the
EventMonitor. The recursive call is within a try { } finally { } where
the monitor and stdin are guaranteed to be cleared at the end.
Adding support for a custom (and default) watch logger is trivial with
the new infrastructure and is done via the watchLogger TaskKey.
There was a small reporting race condition that was introduced by the
change to (2). Because the new implementation is able to bypass command
parsing for triggered builds, the watch message would usually end up
being printed before the task outcome was fully logged. To work around
this, I made the watch and triggered messages be logged rather than
printed directly to stdout. As a result, the only user visible result of
this change should be that instead of seeing:
"1. Waiting for source changes in project foo... (press enter to interrupt)",
users will now see:
"[info] 1. Waiting for source changes in project foo... (press enter to interrupt)".
There have been reports that often a new build will be triggered
immediately after the previous build even when none of the files have
been modified since the start of the last build. This can happen when,
for example, a program implements save with a rename. When that occurs,
a deletion watch event may trigger the build but the corresponding
creation event may be detected outside of the current 40ms window. By
bumping this value to 500ms, we hopefully prevent the majority of these
false triggers. For unusual workflows in which this longer quarantine
period is an issue, the setting can be overridden.
This change makes the temporary shared library that is created by
the swoval file-tree-views library to be extracted into the sbt global
base directory rather than the temp file. This way if there is a leak of
shared libraries, they can easily be found in ~/.sbt rather than in,
say, /tmp (or the osx/windows equivalent location). The extracted shared
library objects will be in the path ~/.sbt/swoval-jni. There is a
shutdown hook that removes them as well as a garbage collection process
that runs in the background whenever the swoval library is loaded, so
these shouldn't leak uncontrollably.
I had to turn off -Xfatal-warnings in commandProj because after updating
io, commandProj depends on the deprecated EventMonitor class. In #4335,
I stop using EventMonitor, but deprecate the Watched class which is both
defined and used (as an unused attribute key) in commandProj. I think we
can probably get rid of Watched in 1.4.x and certainly in a hypothetical
2.x, so hopefully we can restore -Xfatal-warnings sooner than later.
I also had to replace uses of IO.classLocationFile with
IO.classLocationPath to avoid compilation failures due to
-Xfatal-warnings.
Fixes#4362
This implements an instance of ExecuteProgress that is enabled by default to report progress on the shell.
Combined with the scroll up logger, this takes over the bottom lines of the terminal screen and display a count up clock of the currently executing task.
The end goal is to rewrite Dotty's compiler-bridge in Java (this is easy
since the zinc-specific phases are in the compiler itself) to simplify
the bootstrapping process.
Change CommandExchange#publishEvent to not broadcast ExecStatusEvent and
instead check the channelName, this matches what was already done in
CommandExchange#publishEventMessage
Fixes#4293
Ref #4231, #4065
This fixes the regression on sbt 1.2.0 that displays a lot of warnings about configurations.
The warning was added in #4231 in an attempt to fix#4065. This actually highlights somewhat loose usage of configurations among the builds in the wild, and the limitation on the current slash syntax implementation.
I think we can remove this warning for now, and try to fix#4065 by making slash syntax more robust. In particular, we need to memorize the mapping between the configuration name and Scala identifier across the entire build, and use that in the shell.
Ref https://github.com/sbt/sbt/issues/4121
sbt already has the facility to trim stack traces. This sets the trace level of the background run, which fixes the upper half of the `run` stacktrace.
```
[error] (run-main-0) java.lang.Exception
[error] java.lang.Exception
[error] at Hello$.delayedEndpoint$Hello$1(Hello.scala:5)
[error] at Hello$delayedInit$body.apply(Hello.scala:1)
[error] at scala.Function0.apply$mcV$sp(Function0.scala:34)
[error] at scala.Function0.apply$mcV$sp$(Function0.scala:34)
[error] at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
[error] at scala.App.$anonfun$main$1$adapted(App.scala:76)
[error] at scala.collection.immutable.List.foreach(List.scala:389)
[error] at scala.App.main(App.scala:76)
[error] at scala.App.main$(App.scala:74)
[error] at Hello$.main(Hello.scala:1)
[error] at Hello.main(Hello.scala)
[error] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[error] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[error] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[error] at java.lang.reflect.Method.invoke(Method.java:498)
[error] java.lang.RuntimeException: Nonzero exit code: 1
[error] at sbt.Run$.executeTrapExit(Run.scala:127)
[error] at sbt.Run.run(Run.scala:77)
[error] at sbt.Defaults$.$anonfun$bgRunTask$5(Defaults.scala:1254)
[error] at sbt.Defaults$.$anonfun$bgRunTask$5$adapted(Defaults.scala:1249)
[error] at sbt.internal.BackgroundThreadPool.$anonfun$run$1(DefaultBackgroundJobService.scala:377)
[error] at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
[error] at scala.util.Try$.apply(Try.scala:209)
[error] at sbt.internal.BackgroundThreadPool$BackgroundRunnable.run(DefaultBackgroundJobService.scala:299)
[error] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[error] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[error] at java.lang.Thread.run(Thread.java:748)
[error] (Compile / run) Nonzero exit code: 1
```
The bottom half requires a similar fix to the foreground log.
Fixes https://github.com/sbt/sbt/issues/3508
This forks an instance of sbt in the background when it's not running already.
```
$ time sbt -client compile
Getting org.scala-sbt sbt 1.2.0-SNAPSHOT (this may take some time)...
:: retrieving :: org.scala-sbt#boot-app
confs: [default]
79 artifacts copied, 0 already retrieved (28214kB/130ms)
[info] entering *experimental* thin client - BEEP WHIRR
[info] server was not detected. starting an instance
[info] waiting for the server...
[info] waiting for the server...
[info] server found
> compile
[success] completed
sbt -client compile 9.25s user 2.39s system 33% cpu 34.893 total
$ time sbt -client compile
[info] entering *experimental* thin client - BEEP WHIRR
> compile
[success] completed
sbt -client compile 3.55s user 1.68s system 107% cpu 4.889 total
```
There's also a special case for aliases that will try to resolve
the target of the alias to a task key if possible and display the
output of that key if found.
see https://github.com/sbt/sbt/issues/2881
Fixes https://github.com/sbt/sbt/issues/1502
This adds `--addPluginSbtFile=<file>` command, which adds the given .sbt file to the plugin build.
Using this mechanism editors or IDEs can start a build with required plugin.
```
$ cat /tmp/extra.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.7")
$ sbt --addPluginSbtFile=/tmp/extra.sbt
...
sbt:helloworld> plugins
In file:/xxxx/hellotest/
...
sbtassembly.AssemblyPlugin: enabled in root
```
I spent a lot of time debugging why it took 5 seconds to run tests each
time. It turns out that if the hostname is not set explicitly on os x,
then getaddrinfo takes 5 seconds to try (and fail) to resolve the dns
entry for the localhostname. This is easily fixed by setting the
hostname, but it is not at all easy to figure out that a slow hostname
lookup is the reason why tests are slow to start.
I don't know if this is a common issue on other platforms, so only issue
the warning on OS X.
It turns out that `syntaxAnalyzer.UnitParser()` in global now also
needs to be synchronized. The alternative is adding `synchronizeNames = true`
in the global constructor, but that already proved unreliable in the
case of #3743 (see comment https://github.com/sbt/sbt/issues/3170#issuecomment-355218833)