Ideally we use the FileTreeRepository for interactive sessions by
default. A continuous build is effectively interactive, so I'd like that
case to also use the file tree repository. To avoid breaking scripted
tests, many of which implicitly expect file tree changes to be
instantaneously available, we set interactive to true only if we are not
in a scripted run, which can be verified by checking whether the commands
contain "setUpScripted".
Sometimes a user may want to rerun their task even if the source files
haven't changed. Presently this is a little annoying because you have to
hit enter to stop the build and then up arrow or <ctrl+r> plus enter to
rebuild. It's more convenient to just be able to press the 'r' key to
re-run the task.
To implement this, I had to make the watch task set up a jline terminal
so that System.in would be character buffered instead of line buffered.
Furthermore, I took advantage of the NonBlockingInputStream
implementation provided by jline to wrap System.in. This was necessary
because even with the jline terminal, System.in.available doesn't return
> 0 until a newline character is entered. The NonBlockingInputStream, however,
provides a peek API with a timeout that returns the next unread key from
System.in if one is available. This can be used to proxy available in the
WrappedNonBlockingInputStream.
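A minimal sketch of the wrapper, assuming jline's NonBlockingInputStream API (the exact class and behavior may differ from what sbt ends up using):

```scala
import java.io.InputStream
import org.jline.utils.NonBlockingInputStream

// Sketch of the wrapper: proxy `available` through `peek` with a short timeout so
// that a single keypress becomes visible without waiting for a newline. peek
// returns the next byte, -1 at EOF, or a negative sentinel when the timeout expires.
class WrappedNonBlockingInputStream(underlying: NonBlockingInputStream) extends InputStream {
  override def read(): Int = underlying.read()
  override def available(): Int = if (underlying.peek(10L) >= 0) 1 else 0
}
```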
To ensure maximum user flexibility, I also update the watchHandleInput key to
take an InputStream and return an Action. This setting will now receive
the wrapped System.in, which will allow the user to create their own
keybindings for watch actions without needing to use jline themselves.
Future work might make it more straightforward to go back to a line
buffered input if that is what the user desires.
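For example, a user-defined handler could look roughly like this (a sketch; the Action names and the exact key signature are illustrative):

```scala
// Sketch of a custom watchHandleInput: 'q' cancels the watch, 'b' forces a
// re-build, anything else is ignored. The Action names are illustrative.
watchHandleInput := { (in: java.io.InputStream) =>
  if (in.available() > 0) in.read().toChar match {
    case 'q' => Watched.CancelWatch
    case 'b' => Watched.Trigger
    case _   => Watched.Ignore
  }
  else Watched.Ignore
}
```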
For projects with a large number of files, zinc has to do a lot of work
to determine which source files and binaries have changed since the last
build. In a very simple project with 5000 source files, it takes roughly
750ms to do a no-op compile using the default incremental compiler
options. After this change, it takes about 200ms. Of those 200ms, 50ms
are due to the update task, which does a partial project resolution*.
The implementation is straightforward since zinc already provides an API
for overriding the built-in change detection strategy. In a previous
commit, I updated the sources task to return StampedFile rather than
regular java.io.File instances. To compute all of the source file
stamps, we simply list the sources; if a source is already an instance of
StampedFile, its stamp does not need to be recomputed, otherwise we generate
a StampedFile on the fly. After building a map of stamped files
for both the source files and all of the binary dependencies, we simply
diff these maps with the previous results in the changedSources,
changedBinaries and removedProducts methods.
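An illustrative sketch of that idea (not the exact hook code; StampedFile is the trait introduced earlier, and the use of sbt.internal.inc.Stamper here is an assumption):

```scala
import java.io.File
import sbt.internal.inc.Stamper // assumption: provides the on-the-fly hash stamp

// Reuse the stamp carried by a StampedFile and only hash files that don't carry
// one, then diff the resulting map against the previous analysis.
def stampOf(f: File): String = f match {
  case sf: StampedFile => sf.stamp.toString                // computed while listing sources
  case other           => Stamper.forHash(other).toString  // computed on the fly
}

def changedSources(sources: Seq[File], previousStamps: Map[File, String]): Set[File] =
  sources.filter(f => !previousStamps.get(f).contains(stampOf(f))).toSet
```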
The new ExternalHooks are easily disabled by setting
`externalHooks := _ => None`
in the project build.
In the future, I could see moving ExternalHooks into the zinc project so
that other tools like bloop or mill could use them.
* I think this delay could be eliminated by caching the UpdateResult so
long as the project doesn't depend on any snapshot libraries. For a
project with a single source, the no-op compile takes O(50ms), so caching
the project resolution would make compilation startup nearly
instantaneous.
I realized that using the cache has the potential to cause issues for
batch processing in CI if some tasks assume that a file created by one
task will immediately be visible in the other. With the cache, there is
typically an O(10ms) latency between a file being created and appearing
in the cache (at least on OSX). When manually running commands, that
latency doesn't matter.
It is not always possible to monitor a directory using OS file system
events. For example, inotify does not work with NFS. To work around
this, I add support for a hybrid FileTreeViewConfig that caches a
portion of the file system and monitors it with OS file system
notifications, but polls a subset of the directories. When we query
the view using list or listEntries, we will actually query the file
system for the polling directories while we will read from the cache for
the remainder. When we are not in a continuous build (i.e. not running '~'),
there is no polling of the pollingDirectories, but the cache will continue to update
the regular directories in the background. When we are in a continuous
build, we use a PollingWatchService to poll the pollingDirectories and
continue to use the regular repository callbacks for the other
directories.
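The listing logic behaves roughly like this (a sketch; the signature and helper names are illustrative):

```scala
import java.nio.file.{ Files, Path }
import scala.collection.JavaConverters._

// Sketch of the hybrid strategy: directories registered for polling are listed
// straight from the file system, everything else is served from the in-memory cache.
def list(dir: Path, pollingDirectories: Set[Path], cache: Map[Path, Seq[Path]]): Seq[Path] =
  if (pollingDirectories.exists(dir.startsWith(_)))
    Files.list(dir).iterator.asScala.toVector // poll the file system directly
  else
    cache.getOrElse(dir, Nil)                 // read from the cached file tree
```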
I suspect that #4179 may be resolved by adding the directories for which
monitoring is not working to the pollingDirectories task.
Now that we have the fileTreeView task, we can generalize the process
of collecting files from the view (which may or may not actually cache
the underlying file tree). I moved the implementation of collectFiles
and addBaseSources into the new FileManagement object because Defaults
is already too large of a file. When we query the view, we also need to
register the directory we're listing because if the underlying view is a
cache, we must call register before any entries will be available.
Because FileTreeDataView doesn't have a register method, I implement
registration with a simple implicit class that pattern matches on the
underlying type and only calls register if it is actually a
FileRepository.
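A sketch of that helper (the signatures here are illustrative rather than the exact sbt.io ones):

```scala
import sbt.io.{ FileRepository, FileTreeDataView } // assumption: the sbt.io types referenced above

// Only a view backed by a cache (a FileRepository) actually needs register; a
// plain view lists the file system directly, so there is nothing to do.
implicit class RegisterOps(val view: FileTreeDataView[_]) extends AnyVal {
  def register(path: java.nio.file.Path, maxDepth: Int): Unit = view match {
    case repo: FileRepository[_] => repo.register(path, maxDepth); ()
    case _                       => ()
  }
}
```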
A side effect of this change is that the underlying files returned by
collectFiles and appendBaseSources are StampedFile instances. This is so
that in a subsequent commit, I can add a Zinc external hook that will
read these stamps from the files in the source input array rather than
compute the stamp on the fly. This leads to a substantial reduction in
Zinc startup time for projects with many source files. The file filters
may also be applied more quickly because the isDirectory property (which
we check for all source files) is read from a cached value rather than
requiring a stat.
I had to update a few of the scripted tests to use the `1.2.0`
FileTreeViewConfig because those tests would copy a file and then
immediately re-compile. The latency of cache invalidation is O(1-10ms),
but not instantaneous, so it's necessary to either use a non-caching
FileTreeView or add a sleep between updates and compilation. I chose the
former.
Every time that the compile task is run, there are potentially a large
number of I/O operations that must occur in order for sbt to generate the source
file list as well as for zinc to check which files have changed since
the last build. This can lead to a noticeable delay between when a build
is started (either manually or by triggered execution) and when
compilation actually begins. To reduce this latency, I am adding a
global view of the file system that will be stored in
BasicKeys.globalFileTreeView.
To make this work, I introduce the StampedFile trait, which augments the
java.io.File class with a stamp method that returns the zinc stamp for
the file. For source files, this will be a hash of the file, while for
binaries, it is just the last modified time. In order to gain access to
the sbt.internal.inc.Stamper class, I had to append addSbtZinc to the
commandProj configurations.
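A sketch of the StampedFile shape (the real type in sbt may differ):

```scala
import java.io.File
import xsbti.compile.analysis.Stamp

// A File that carries the zinc stamp computed when the file tree was listed, so
// zinc does not have to re-hash or re-stat the file at compile time.
class StampedFile(path: String, val stamp: Stamp) extends File(path)
```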
This view may or may not use an in-memory cache of the file system tree
to return the results. Because there is always the risk of the cache
getting out of sync with the actual file system, I both make it optional
to use a cache and provide a mechanism for flushing the cache. Moreover,
the in-memory cache implementation in sbt.io, which is backed by a
swoval FileTreeRepository, has the property that touching a monitored
directory invalidates the entire directory within the cache, so the
flush command isn't even strictly needed in general.
Because caching is optional, the global view is a FileTreeDataView, which
doesn't specify a caching strategy. Subsequent commits will make use of
this to potentially speed up incremental compilation by caching the
Stamps of the source files so that zinc does not need to compute the
hashes itself and will allow for continuous builds to use the cache to
monitor events instead of creating a new, standalone FileEventMonitor.
It may be useful for users to be able to return their own custom
Action types in the Config callbacks. For a contrived example, a user
could add a jar file in the .ivy2 directory to the watch sources and
trigger a full reboot when that jar changes.
There may be instances where the user may wish to stop the watch if an
error occurs while running the task. To facilitate this, I add a boolean
parameter, lastStatus, to watchShouldTerminate. The value is computed by
modifying the state used to run the task to have a custom onFailure
command. If the task fails, the returned state will have the onFailure
command enqueued at the head of the remaining commands. The
result of the task then becomes true if the custom onFailure is not
present in the remaining commands and false if it is. We don't actually
run this command, so it's just implemented with the identity function.
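A sketch of how lastStatus can be derived (the marker command name and details are illustrative):

```scala
import sbt.{ Exec, State }

// Run the task with a sentinel onFailure command; if the task failed, sbt will
// have enqueued that command, so its absence means the task succeeded.
val failureMarker: Exec = Exec("watch-failure-marker", None)

def runWithStatus(state: State, runTask: State => State): (State, Boolean) = {
  val newState   = runTask(state.copy(onFailure = Some(failureMarker)))
  val lastStatus = !newState.remainingCommands.headOption.contains(failureMarker)
  (newState.copy(onFailure = state.onFailure), lastStatus)
}
```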
I also updated Watched.watch to return an Action instead of Unit. This
enables us to return a failed state if Watched.watch returns
HandleError.
This commit reworks Watched to be more testable and extensible. It also
adds some small features. The previous implementation presented a number
of challenges:
1) It relied on external side effects to terminate the watch, which was
difficult to test
2) It exposed irrelevant implementation details to the user in the
methods that took the WatchState as a parameter.
3) It spun up two worker threads. One was to monitor System.in for user
input. The other was to poll the watch service for events and write
them to a queue. The user input thread actually broke '~console'
because nearly every console session will hit the <enter> key, which
would eventually cause the watch to stop when the user exited the
console.
To address (1), I add the shouldTerminate method to WatchConfig. This
takes the current watch iteration as input and, if the function returns
true, the watch will stop.
To address (2), I replace the triggeredMessage and watchingMessage keys
with watchTriggeredMessage and watchStartMessage. The latter two keys
are functions that do not take the WatchState as a parameter. Both
functions take the current iteration count as a parameter and the
watchTriggeredMessage also has a parameter for the path that triggered
the build.
To address (3), I stop using the sbt.internal.io.EventMonitor and
instead use the sbt.io.FileEventMonitor. The latter class is similar to
the former except that its poll method accepts a duration (which may be
finite or infinite) and returns all of the events that occurred since
it was last polled. By adding the ability to poll for a finite amount of
time, we can interleave polling for events with polling System.in for
user input, all on the main thread. This eliminates the two extraneous
threads and fixes the '~console' use case I described before.
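The interleaving looks roughly like this (a sketch; the Action names and the monitor's poll signature are illustrative):

```scala
import scala.annotation.tailrec
import scala.concurrent.duration._

// Alternate a short, finite poll for file events with a check of user input,
// all on the calling thread.
@tailrec
def watchLoop(pollEvents: FiniteDuration => Seq[Any],
              handleInput: () => Watched.Action): Watched.Action =
  if (pollEvents(10.millis).nonEmpty) Watched.Trigger // a file changed: trigger a build
  else handleInput() match {
    case Watched.Ignore => watchLoop(pollEvents, handleInput) // nothing yet: keep polling
    case action         => action                             // stop or re-run as requested
  }
```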
I also let the user configure the function that reads from System.in via
the watchHandleInput method. In fact, this method need not read from
System.in at all since it's just () => Watched.Action. The reason that
it isn't () => Boolean is that I'd like to leave open the ability to
trigger a build via user input, not just terminate the
watch. My initial idea was to add the ability to type 'r' to re-build in
addition to <enter> to exit. This doesn't work without integrating
jline though because the input is buffered. Regardless, for testing
purposes, it gives us the ability to add a timeout to the watch by
making handleInput return a terminating Action when a deadline expires.
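For example, the testing trick can be sketched like this (the Action names are illustrative):

```scala
import scala.concurrent.duration._

// A handleInput that cancels the watch once a deadline has expired.
def timeoutHandler(timeout: FiniteDuration): () => Watched.Action = {
  val deadline = timeout.fromNow
  () => if (deadline.isOverdue()) Watched.CancelWatch else Watched.Ignore
}
```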
The tests are a bit wonky because I still need to rely on side effects
in the logging methods to orchestrate the sequence of file events that
I'd like to test. While I could move some of this logic into a
background thread, there still needs to be coordination between the
state of the watch and the background thread. I think it's easier to
reason about when all of the work occurs on the same thread, even if it
makes these user provided functions impure.
I deprecated all of the previous watch related keys that are no longer
used with the new infrastructure. To avoid breaking existing builds, I
make the watchConfig task use the deprecated logging methods if they are
defined in the user's builds, but sbt will no longer set the default
values. For the vast majority of users, it should be straightforward to
migrate their builds to use the new keys. My hunch is that, of the
deprecated keys, only triggeredMessage is widely used (in conjunction
with the clear screen method) and it is dead simple to replace it with
watchTriggeredMessage.
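The migration looks roughly like this (the exact signatures are illustrative): instead of a function of the full WatchState, the new key receives the triggering path and the iteration count.

```scala
// Before: triggeredMessage := { ws => s"Build ${ws.count} triggered..." }
watchTriggeredMessage := { (path, count) => Some(s"Build #$count triggered by $path") }
```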
Note: The FileTreeViewConfig class is not really necessary for this commit.
It will become more important in a subsequent commit which introduces an
optional global file system cache.
This commit makes watch event logging work in the '~' command. The
previous design of the command made this difficult, so there is a
significant re-design of the implementation of '~'. I believe that this
redesign will allow the feature to be maintained and improved more
easily moving forward. With the redesign, it is now possible to test the
business logic of the watch command (and I add a rudimentary test that I
will build upon in subsequent commits).
A bonus of this redesign is that now if the user tries to watch an
invalid command, the watch will immediately terminate with an error
rather than get stuck waiting for events when the task can never
possibly succeed.
The previous implementation of the '~' command makes it difficult
to dynamically control the implementation arguments because it is
implemented in the command project, which makes it unable to depend on
any task keys that are defined in the build. It works around this by
putting all of its configuration in the Watched attribute, which is
stored globally. This would not have been necessary if the function had
been defined in the main project where it could just extract the value
of the watched task rather than relying on the global attribute value.
Moreover, because it cannot depend on tasks, it makes it nigh impossible
to use the logging framework within the '~' command.
Another issue with the previous implementation is that it's somewhat
difficult to reason about. The executeContinuously function effectively has two
entry points: one for the first time the command is run and one for each
subsequent invocation when a new build is triggered. The successive
invocations are triggered by prepending commands to run to the previous
state. This is made recursive by prepending the initial command (the one
that was prefixed with '~'). Which branch we're in is determined by checking
for the existence of a temporary attribute that we must ensure is removed
when the watch is stopped. This makes a lot of behavior non-local and
difficult for an outsider who is less familiar with sbt to understand.
Broadly, this refactor does two things:
1) Move the definition of continuous from BasicCommands to BuiltInCommands
2) Re-work the implementation to be executed in code rather than using
the sbt dsl.
The first part is simple. We just add an implementation of continuous to
BuiltInCommands and remove it from the list of BasicCommands. We need to
leave in the legacy implementation for binary compatibility. I also
moved all of the actual implementation logic into Watched, which makes
maintenance easier since most of the logic is in one place.
The second part is more complicated. Rather than relying on the sbt dsl
(e.g. `(ClearOnFailure :: next :: FailureWall :: repeat :: s)`) to
parse and run the command, we manually parse the command and generate a
task of type `() => State`. We don't actually need to do anything with
the generated state because we're going to return the original state at
the end of the command no matter what. With this task, we can then
create a tail recursive function that repeatedly executes the task until
the watch is terminated.
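Schematically, the loop replaces the command-prepending dance with something like this (the helper and Action names are hypothetical):

```scala
import scala.annotation.tailrec
import sbt.State

// Parse the command once into a task, then run that task repeatedly until the
// watch is terminated.
@tailrec
def loop(task: () => State, waitForTrigger: Int => Watched.Action, count: Int): Unit = {
  task() // the resulting State is discarded; the command returns the original State
  waitForTrigger(count) match {
    case Watched.Trigger => loop(task, waitForTrigger, count + 1) // a file changed: run again
    case _               => ()                                    // terminate the watch
  }
}
```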
The parsing is handled in the Watch.command method (which is where I
moved the refactored BasicCommands.continuous implementation). The
actual task running and monitoring is handled in Watched.watch. This
method has no reference to the sbt state, which makes it testable. It sets
up an event monitor and then delegates the recursive monitoring to a
small nested function, Watched.watch.impl. One nice thing about this
approach is that it is very easy to reason about the life cycle of the
EventMonitor. The recursive call is within a try { } finally { } where
the monitor and stdin are guaranteed to be cleared at the end.
Adding support for a custom (and default) watch logger is trivial with
the new infrastructure and is done via the watchLogger TaskKey.
There was a small reporting race condition that was introduced by the
change to (2). Because the new implementation is able to bypass command
parsing for triggered builds, the watch message would usually end up
being printed before the task outcome was fully logged. To work around
this, I made the watch and triggered messages be logged rather than
printed directly to stdout. As a result, the only user visible result of
this change should be that instead of seeing:
"1. Waiting for source changes in project foo... (press enter to interrupt)",
users will now see:
"[info] 1. Waiting for source changes in project foo... (press enter to interrupt)".
There have been reports that often a new build will be triggered
immediately after the previous build even when none of the files have
been modified since the start of the last build. This can happen when,
for example, a program implements save with a rename. When that occurs,
a deletion watch event may trigger the build but the corresponding
creation event may be detected outside of the current 40ms window. By
bumping this value to 500ms, we hopefully prevent the majority of these
false triggers. For unusual workflows in which this longer quarantine
period is an issue, the setting can be overridden.
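A sketch of such an override, assuming the quarantine window is exposed as a key named watchAntiEntropy (the actual key name may differ):

```scala
import scala.concurrent.duration._

// Assumption: the quarantine window is exposed as the watchAntiEntropy setting.
watchAntiEntropy := 100.millis
```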
This change causes the temporary shared library created by the swoval
file-tree-views library to be extracted into the sbt global base directory
rather than the temp directory. This way, if there is a leak of shared
libraries, they can easily be found in ~/.sbt rather than in, say, /tmp
(or the OSX/Windows equivalent location). The extracted shared
library objects will be in the path ~/.sbt/swoval-jni. There is a
shutdown hook that removes them as well as a garbage collection process
that runs in the background whenever the swoval library is loaded, so
these shouldn't leak uncontrollably.
I had to turn off -Xfatal-warnings in commandProj because after updating
io, commandProj depends on the deprecated EventMonitor class. In #4335,
I stop using EventMonitor, but deprecate the Watched class which is both
defined and used (as an unused attribute key) in commandProj. I think we
can probably get rid of Watched in 1.4.x and certainly in a hypothetical
2.x, so hopefully we can restore -Xfatal-warnings sooner rather than later.
I also had to replace uses of IO.classLocationFile with
IO.classLocationPath to avoid compilation failures due to
-Xfatal-warnings.
Fixes #4362
This implements an instance of ExecuteProgress that is enabled by default to report progress in the shell.
Combined with the scroll-up logger, it takes over the bottom lines of the terminal screen and displays a count-up clock for the currently executing tasks.