It is inefficient to be constantly allocating and filling an
ArrayBuffer. The buffer is only used for reading headers that we mostly
discard anyway. My assumption is that since we only care about content
length, it's fine to put a fixed limit on the buffer size.
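Roughly, the idea looks like this (a minimal sketch; the names and the 4096 byte limit are made up, and the shared buffer is not thread safe):
```
import java.io.InputStream
import scala.util.Try

object ContentLength {
  private val MaxHeaderLength = 4096 // assumed fixed limit
  private val buffer = new Array[Byte](MaxHeaderLength)

  // Read one header line into the preallocated buffer and extract the
  // content length if the line is a Content-Length header.
  def parse(in: InputStream): Option[Int] = {
    var len = 0
    var b = in.read()
    while (b != -1 && b != '\n' && len < MaxHeaderLength) {
      buffer(len) = b.toByte; len += 1; b = in.read()
    }
    val line = new String(buffer, 0, len, "UTF-8").trim
    if (line.toLowerCase.startsWith("content-length:"))
      Try(line.substring("content-length:".length).trim.toInt).toOption
    else None
  }
}
```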
When the user ran a command like `testOnly foo -- bar`, the client was
incorrectly treating the `--` as an sbt argument. The assumption is that
once an argument is found that does not start with a `-`, then
everything following that argument is part of the command arguments.
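As a sketch, the splitting rule amounts to something like this (the function name and the example flag are hypothetical):
```
// everything from the first argument that does not start with '-' onwards
// belongs to the command, not to sbt
def splitClientArgs(args: Seq[String]): (Seq[String], Seq[String]) = {
  val i = args.indexWhere(a => !a.startsWith("-"))
  if (i < 0) (args, Nil) else (args.take(i), args.drop(i))
}

// splitClientArgs(Seq("--no-color", "testOnly", "foo", "--", "bar"))
// => (Seq("--no-color"), Seq("testOnly", "foo", "--", "bar"))
```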
The existing implementation of watch did not work with the thin client.
In sbt 1.3.0, watch was changed to be a blocking command that performed
manual task evaluation. This commit makes the implementation more
similar to < 1.3.0 where watch modifies the state and after running the
user specified command(s), it enters a blocking command. The new
blocking command is very similar to the shell command.
As part of this change, I also reworked some of the internals of watch
so that a number of threads are spawned for reading file and input
events. By using background threads that write to a single event queue,
we are able to block on the file events and terminal input stream rather
than polling. After this change, the cpu utilization as measured by ps
drops from roughly 2% of a cpu to 0.
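The shape of the producer/consumer setup is roughly this (illustrative only, not the actual sbt internals):
```
import java.util.concurrent.LinkedBlockingQueue

sealed trait WatchEvent
case class FileChanged(path: String) extends WatchEvent
case class KeyPressed(key: Char) extends WatchEvent

val events = new LinkedBlockingQueue[WatchEvent]

// one producer thread per source; each blocks on its own input
val inputReader = new Thread("watch-input-reader") {
  setDaemon(true)
  override def run(): Unit = {
    var b = System.in.read()
    while (b != -1) { events.put(KeyPressed(b.toChar)); b = System.in.read() }
  }
}
inputReader.start()

// the watch loop then blocks with zero cpu cost: val event = events.take()
```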
To integrate with the network client, we introduce a new UITask that is
similar to the AskUserTask but instead of reading lines and adding execs
to the command queue, it reads characters and converts them into watch
commands that we also append to the command queue.
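The character-to-command conversion is conceptually a small mapping like the one below; the keys come from the watch options message shown later in these notes, but the return values are made up:
```
def toWatchAction(key: Char): Option[String] = key match {
  case '\n' | '\r' => Some("return to the shell")
  case 'r'         => Some("re-run the command")
  case 'x'         => Some("exit sbt")
  case _           => None
}
```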
With this new implementation, the watch task that was added in 1.3.0 no
longer works. My guess is that no one was really using it. It wasn't
documented anywhere. The motivation for the task implementation was that
it could be called within another task which would let users define a
task that monitors for file changes before running. Since this had never
been advertised and is only of limited utility anyway, I think it's fine
to break it.
I also had to disable the input-parser and symlinks tests. I'm not 100%
sure why the symlinks test was failing. It would tend to work on my
machine but fail in CI. I gave up on debugging it. The input-parser test
also fails but would be a good candidate to be moved to the client test
in the serverTestProj. At any rate, it was testing a code path that was
only exercised if the user changed the watchInputStream method which is
highly unlikely to have been done in any user builds.
The WatchSpec had become a nuisance and wasn't really preventing any
regressions, so I removed it. The scripted tests are how we test
watch.
The sbtc client can provide a ux very similar to using the sbt shell
when combined with tab completions. In fact, since some shells have a
better tab completion engine than the one provided by jline2, the
experience can be even better. To make this work, we add another entry
point to the thin client that is capable of generating completions for
an input string. It queries sbt for the completions and prints the
result to stdout, where they are consumed by the shell and fed into its
completion engine.
In addition to providing tab completions, if there is no server running
or if the user is completing `runMain`, `testOnly` or `testQuick`, the
thin client will prompt the user to ask if they would like to start an
sbt server or if they would like to compile to generate the main class
or test names. Neither powershell nor zsh supports forwarding input to
the tab completion script. Zsh will, however, print output to stderr, so
we opportunistically start the server or complete the test class names.
Powershell does not print completion output at all, so we do not start a
server or fill in completions in that case*. For fish and bash, we prompt
the user that they can take these actions so that they can avoid the
expensive operation if desired.
* Powershell users can set the environment variable SBTC_AUTO_COMPLETE
if they want to automatically start a server or compile for run and test
names. No output will be displayed so there can be a long latency
between pressing <tab> and seeing completion results if this variable is
set.
This commit adds the ability for sbt to automatically shut itself down
if it has been idle for some duration of time. The motivation is that
the user may not realize they have an sbt server running in the
background that is using resources. We don't want to be too aggressive
with the idle timeout because that can reduce the efficacy of the thin
client. A value of one week is chosen so that users can enjoy a long
weekend and when they return to their computer, they won't have to
restart sbt. If they haven't used the server in at least a week, it
seems prudent to just kill it.
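Assuming the timeout is surfaced as a setting (the key name below may not match the final api), a build could shorten it like so:
```
import scala.concurrent.duration._

// shorten the idle timeout from the default of one week to three days
Global / serverIdleTimeout := Some(3.days)
```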
The sbtipcsocket by default restricts win32 named pipes to only allow
connections from the same login session. This makes connecting to a
remote server not work over ssh. We relax the default slightly in sbt to
allow the owner of the pipe to connect over any logon shell. The user
could restore the old behavior with:
```
Global / windowsServerSecurityLevel := Win32SecurityLevel.LOGON_DACL
```
or, if YOLO
```
Global / windowsServerSecurityLevel := Win32SecurityLevel.NO_SECURITY
```
This project is used to create client executables. The implementation is
pure java but we can build graalvm native-images from the java main
class. There are two versions of the client. One of them uses the
ipcsocket jni implementation to connect to the sbt server while the
other uses jna. It is necessary to use jni for the graalvm native-image
tool to work. Otherwise the two approaches should be identical.
When we start sbt with the thin client, we want to close the server io
streams after it loads so that the client exiting won't crash the
server. When we are running the server as part of the server tests, it
is nice to have the server output. By setting the --close-io-streams
flag when we launch the server in the client, we are able to achieve
both.
Running multi commands (input commands delimited by semicolons) did not
work with the thin client. The commands would actually run on the
server, but the thin client would exit immediately without displaying
the output. The reason was that MainLoop would report the exec complete
when all it had done was split the original command into its constituent
parts and prepended them to the state command list. To work around this,
when we detect a network source command, we can remap its exec id to a
different id and only report the original exec id after the commands
complete. We also have to keep track of whether or not the command
succeeded or failed so that the reporting command reports the correct
result.
The way it's implemented is with the following steps (a rough sketch in
code follows the list):
1. set the terminal to the network terminal
2. stash the current onFailure so that we can properly report failures
3. add the new exec id to a map of the original exec id to the generated
id
4. actually run the command
5. if the command succeeds, add the original exec id to a result map
6. pop the onFailure
7. restore the terminal to console
8. report the result -- if the original exec id is in the result map we
report success. Otherwise we report failure.
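The bookkeeping for steps 3, 5 and 8 amounts to something like this (all names hypothetical):
```
import java.util.UUID
import java.util.concurrent.ConcurrentHashMap

val generatedToOriginal = new ConcurrentHashMap[String, String]
val succeeded = ConcurrentHashMap.newKeySet[String]()

def remap(originalExecId: String): String = {
  val generated = UUID.randomUUID.toString
  generatedToOriginal.put(generated, originalExecId)
  generated // the multi command runs under this id instead
}

def reportOriginal(generatedExecId: String): Unit =
  Option(generatedToOriginal.remove(generatedExecId)).foreach { original =>
    val ok = succeeded.remove(original) // step 5 added it on success
    // respond to the client with `original` and ok/failed
  }
```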
There is also logic in NetworkChannel for finding the original exec id
when reporting one of the artificially generated exec ids, because the
client will not be aware of that id.
When the user presses ctrl+c, we want to cancel any running tasks that
were initiated by that client. This is a bit tricky because we may not
be sure what is running if the client is in interactive mode. To work
around this, we send a cancellation request with the special id
__CancelAll. When the NetworkChannel receives this request, it cancels
the active task if it was initiated by the client that sent the
cancellation request. The result it returns to the client indicates if
there were any tasks to be cancelled. If there were and the client was
in interactive mode, we do not exit. Otherwise we exit.
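On the wire, the cancellation is an ordinary json rpc request; treat the exact shape below as illustrative:
```
val cancelAll =
  """{ "jsonrpc": "2.0", "id": "1", "method": "sbt/cancelRequest",
    |  "params": { "id": "__CancelAll" } }""".stripMargin
```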
This commit integrates the NetworkClient with the server side rendered
ui. Rather than implementing its own shell method, it will now connect
to the server and register itself as a virtual terminal. If there are
command arguments, those will be sent to the server as execs. Otherwise
it will enter a shell mode where it just acts as a relay for io.
In batch mode, it will return the exit code of the last exec sent to the
server. If the server disconnects, the client will exit with an error code.
This commit makes it possible for the sbt server to render the same ui
to multiple clients. The network client ui should look nearly identical
to the console ui except for the log messages about the experimental
client.
The way that it works is that it associates a ui thread with each
terminal. Whenever a command starts or completes, callbacks are invoked
on the various channels to update their ui state. For example, if there
are two clients and one of them runs compile, then the prompt is changed
from AskUser to Running for the terminal that initiated the command
while the other client remains in the AskUser state. Whenever a client
changes ui state, the existing thread is terminated if it is running and
a new thread is started.
The UITask formalizes this process. It is based on the AskUser class
from older versions of sbt. In fact, there is an AskUserTask which is
very similar. It uses jline to read input from the terminal (which could
be a network terminal). When it gets a line, it submits it to the
CommandExchange and exits. Once the next command is run (which may or
may not be the command it submitted), the ui state will be reset.
The debug, info, warn and error commands should work with the multi
client ui. When run, they set the log level globally, not just for the
client that set the level.
In the previous version of the NetworkClient, there was no feedback
while the client was starting up. It was also possible that, if the
server had exited abruptly and there was a dead active.json portfile
left over, the client wouldn't be able to start the server.
This commit reworks things so that we launch the server as a java
process and print out its stdout and stderr streams. We
also forward the client's stdin in case the server couldn't be started
and the user wants to retry or print the stacktrace.
This commit adds support for remote clients to connect to the sbt server
and attach themselves as a virtual terminal. In order to make this work,
each connection must send a json rpc request to attach to the server.
When this is received, the server will periodically query the remote
client to get the terminal properties and capabilities that allow the
remote client to act as a jline terminal proxy. There is also support
for json messages with ids sbt/systemIn and sbt/systemOut that allow io
to be relayed from the remote terminal to the sbt server and back.
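For illustration, the exchange might look like the following; the sbt/systemIn method name comes from the text above, but the attach method name and the payload fields are assumptions:
```
// client -> server: attach as a virtual terminal
val attach =
  """{ "jsonrpc": "2.0", "id": "1", "method": "sbt/attach", "params": { "interactive": true } }"""

// client -> server: relay one byte of local stdin
val stdinByte =
  """{ "jsonrpc": "2.0", "method": "sbt/systemIn", "params": { "bytes": [ 104 ] } }"""
```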
Certain commands such as `exit` should be evaluated immediately. To make
this work, we add the concept of a MaintenanceTask. The CommandExchange
has a background thread that reads MaintenanceTasks and evaluates them
on demand. This allows maintenance tasks to be evaluated even when sbt
is evaluating an exec. If it weren't done this way, when the user typed
exit while a different remote connection was running a command, they
wouldn't be able to exit until the command completed.
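A minimal sketch of that background thread (types and names hypothetical):
```
import java.util.concurrent.LinkedBlockingQueue

final case class MaintenanceTask(channel: String, command: String)
val maintenance = new LinkedBlockingQueue[MaintenanceTask]

val maintenanceThread = new Thread("sbt-maintenance") {
  setDaemon(true)
  override def run(): Unit = while (true) {
    val task = maintenance.take() // handled even while an exec is running
    if (task.command == "exit") {
      // detach the channel immediately instead of queueing behind the exec
    }
  }
}
maintenanceThread.start()
```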
The ServerIntents in ServerHandler did not handle
JsonRpcResponseMessage because prior to this commit, sbt clients were
primarily making requests to the server. But now the server sends
requests to the client for the terminal properties and terminal
capabilities so it was necessary to add an onResponse handler to
ServerIntent.
I had to move the network channel publishBytes method to run on a
background thread because there were scenarios in which the client
socket would get blocked because the server was trying to write on the
same thread that read the bytes from the client.
To make the console command work, it is necessary to hijack the
classloader for JLine. In MetaBuildLoader, we put a custom forked JLine
that has a setter for the TerminalFactory singleton. This allows us to
change the terminal that is used by JLine in ConsoleReader. Without this
hack, the scala console would not work for remote clients.
Neither the ConsoleAppender class nor the jni-based ClientSocket can be
used in a graalvm native image. This commit reworks the NetworkClient so
that we can avoid those limitations. It also adds some additional
command line argument parsing and changes the value of the run method to
return Int rather than Unit for exit code support.
In order to support a multi-client sbt server ux, we need to factor
`Terminal` out into a class instead of a singleton. Each terminal provides
an output stream and an input stream. In all of the places where we were
previously relying on the `Terminal` singleton we need to update the
code to use `Terminal.get`, which will redirect io to the terminal whose
command is currently running.
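The shape of the api is roughly this (method names are assumptions, and the body is stubbed to the console):
```
import java.io.{ InputStream, PrintStream }

trait Terminal {
  def inputStream: InputStream
  def printStream: PrintStream
}

object Terminal {
  // returns the terminal whose command is currently running; in the real
  // implementation this may be the console or a network terminal
  def get: Terminal = new Terminal {
    def inputStream: InputStream = System.in
    def printStream: PrintStream = System.out
  }
}

// call sites that previously used the singleton now go through get:
def status(msg: String): Unit = Terminal.get.printStream.println(msg)
```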
This commit does not implement the server side ui for network clients.
It is just preparatory work for the multi-client ui.
The Terminal implementations have thread safe access to the output
stream. For this reason, I had to remove the synchronization on the
ConsoleOut lockObject. There were code paths that led to deadlock when
synchronizing on the lockObject.
We had similar code for reading json frames from an input stream in
NetworkChannel and ServerConnection. I reworked and consolidated this
logic into a shared method in ReadJsonFromInputStream.
This commit also removes the ObjectMessage reporting methods that
weren't doing anything.
The collectAnalysis task can be a bit slow and delays client connections
from running commands. This commit adds an option to skip the analysis
if it isn't needed. The default behavior is left as it was.
Initial draft for BSP support.
This shows two communication patterns around BSP.
First, if the request can be handled with build knowledge that is readily available in `NetworkChannel`, we can reply immediately. `BuildServerImpl#onBspBuildTargets` is an example of that.
Second, if the request requires `State`, then we can forward the parameter into a custom command, and reply back from a command. `BuildServerProtocol.bspBuildTargetSources` is an example of that since it needs to invoke tasks to generate sources.
It is better that sbt not expose the implementation detail that
LineReader is implemented by JLine. Other terminal related apis should
be handled by sbt.internal.util.Terminal.
Presently if a server command comes in while in the shell, the client
output can appear on the same line as the command prompt and the command
prompt will not appear again until the user hits enter. This is a
confusing ux. For example, if I start an sbt server and type
the partial command "comp" and then start up a client and run the clean
command followed by a compile, the output looks like:
[info] sbt server started at local:///Users/ethanatkins/.sbt/1.0/server/51cfad3281b3a8a1820a/sock
sbt:scala-compile> comp[info] new client connected: network-1
[success] Total time: 0 s, completed Dec 12, 2019, 7:23:24 PM
[success] Total time: 0 s, completed Dec 12, 2019, 7:23:27 PM
[success] Total time: 2 s, completed Dec 12, 2019, 7:23:31 PM
Now, if I type "ile\n", I get:
[info] sbt server started at local:///Users/ethanatkins/.sbt/1.0/server/51cfad3281b3a8a1820a/sock
ile
[success] Total time: 0 s, completed Dec 12, 2019, 7:23:34 PM
sbt:scala-compile>
Following the same set of inputs after this change, I get:
[info] sbt server started at local:///Users/ethanatkins/.sbt/1.0/server/51cfad3281b3a8a1820a/sock
sbt:scala-compile> comp
[info] new client connected: network-1
[success] Total time: 0 s, completed Dec 12, 2019, 7:25:58 PM
sbt:scala-compile> comp
[success] Total time: 0 s, completed Dec 12, 2019, 7:26:14 PM
sbt:scala-compile> comp
[success] Total time: 1 s, completed Dec 12, 2019, 7:26:17 PM
sbt:scala-compile> compile
[success] Total time: 0 s, completed Dec 12, 2019, 7:26:19 PM
sbt:scala-compile>
To implement this change, I added the redraw() method to LineReader
which is a wrapper around ConsoleReader.drawLine; ConsoleReader.flush().
We invoke LineReader.redraw whenever the ConsoleChannel receives a
ConsolePromptEvent and there is a running thread.
To prevent log lines from being appended to the prompt line, in the
CommandExchange we print a newline character whenever a new command is
received from the network or a network client connects and we believe
that there is an active prompt.
The ask user thread is a background thread so it's fine for it to block
on System.in. By blocking rather than polling, the cpu utilization of
sbt drops to 0 on idle. We have to explicitly handle <ctrl+d> if we
block though, because the JLine console reader will return null when
the input stream returns -1.
This commit aims to centralize all of the terminal interactions
throughout sbt. It also seeks to hide the jline implementation details
and only expose the apis that sbt needs for interacting with the
terminal.
In general, we should be able to assume that the terminal is in
canonical (line buffered) mode with echo enabled. To switch to raw mode
or to enable/disable echo, there are apis: Terminal.withRawSystemIn and
Terminal.withEcho that take a thunk as parameter to ensure that the
terminal is reset back to the canonical mode afterwards.
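The pattern is the usual reconfigure/run/restore bracket; a sketch with a stubbed mode switch (the real signatures may differ):
```
object Terminal {
  def withRawSystemIn[T](f: => T): T = {
    setRawMode(true)
    try f
    finally setRawMode(false) // canonical mode restored even if f throws
  }
  private def setRawMode(raw: Boolean): Unit = () // stubbed for illustration
}

// read a single keystroke without waiting for a newline
val key: Int = Terminal.withRawSystemIn(System.in.read())
```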
To demonstrate [-Yno-lub](http://eed3si9n.com/stricter-scala-with-ynolub), this shows the code changes that remove lubing (not all subprojects are done).
After I made the changes, I switched the Scala version back to the normal 2.12.10.
During refactoring, these warnings got out of date. I also added
scaladoc to the watchTriggeredMessage key.
Ref: https://github.com/sbt/sbt/issues/5051.
Sometimes turbo mode didn't work correctly for projects where resources
were modified. This was because it was possible for the resource
classloader to inadvertently evict the dependency classloader from the
classloader cache because they had the same file stamps. There were two
fixes:
1) remove expired entries from the cache based on the
(Parent, Classpath) pair rather than just classpath
2) do not close the classloaders during cache eviction. They may still
be in use when we evict them so we need to wait until they are
explicitly closed elsewhere or until they go out of scope and are
collected by the CleanupThread
I tested this change with a spark project in which I kept modifying the
resources. Prior to this change, I could get into a state where if I
modified the resources, the dependency layer would get evicted every
time so the benefits of turbo mode were not realized.
There have been numerous issues with the multi parser incorrectly
splitting commands like `alias foo = ; bar` into
`"alias foo =" :: "bar" :: Nil`. To fix this, I update the multi parser
implementation to accept a list of commands that cannot be part of a
multi command. For now, the only excluded command is "alias", but if
other issues come up, we can add more. I also thought about adding a
system property for excluding more commands but it didn't seem worth the
maintenance cost at this point.
In addition to adding a filter for the excluded commands, I also
reworked the multi parser so that I think it's clearer (and should
hopefully have more predictable performance). I changed the cmdPart
parser to accept empty strings. Prior to this, the parser explicitly
handled the non-leading semicolon and leading semicolon cases
separately. With the relaxed cmdPart, we can handle both cases with a
single parser. We just have to strip any empty commands at the beginning
or end of the command list.
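Ignoring quoting, braces, and the excluded-command filter, the split-and-strip step behaves like this toy version:
```
def splitMulti(input: String): List[String] = {
  val parts = input.split(';').map(_.trim).toList
  // a relaxed cmdPart can match empty strings; strip them from the ends
  parts.dropWhile(_.isEmpty).reverse.dropWhile(_.isEmpty).reverse
}

// splitMulti("; compile ; test ;") == List("compile", "test")
```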
It was reported in https://github.com/sbt/sbt/issues/4890 that cosmetic
white space could cause problems for the parser. I tracked this down to
primarily being because of the
`val semi = token(OptSpace ~> ';' ~> OptSpace)` line. This would cause
excessive backtracking. I added a test for a multi line command with a
lot of cosmetic whitespace that was adapted from #4890 except that I
made it even more taxing by adding 100 commands instead of the
roughly 10 in the report. Before the parser changes, the test would
more or less block indefinitely. I never saw it successfully complete.
After these changes, it completes in 30-50ms (which drops to about 2-3
ms if the number of commands is dropped from 100 to 3).
I verified manually in a different project that a number of different
multi command completions still worked. In particular, I tested that
`~foo/test; foo/tes` would expand to `~foo/test; foo/test` which is one
of the hardest cases to get right.
I also added a few extra test cases for the parser since I wasn't sure
what the impact of removing the OptSpace ~> from the semi parser would
be.
We tried to prevent users from doing something like running a multi
command "foo; bar" where foo is valid but bar is invalid so that we
wouldn't run foo only to discover bar was an invalid key. It isn't
possible to know in general if any command other than the first command
in a multi command is valid because it might update the state and add
the initially invalid command.
The validation caused the intellij plugin to not work with 1.3.0-RC3.
The recent changes to make the multi parser strict broke any multi
command, or alias, where the multi command contained a command or task
that was not yet defined, but was possibly added by reload. This was
reported as #4869. I had to work around this issue in ScriptedTests
by running `reload` and `setUpScripted` separately instead of as a multi
command. This workaround doesn't work for aliasing boot, which has been
a recommended approach by Mark Harrah since 2011.
To fix this, I relax the strict parser. We don't require that the parser
be valid to create a multi command string. In the multiApplied state
transformation, however, we validate all of the commands up to 'reload'.
Since there is no way to validate any commands to the right of 'reload',
we optimistically allow those commands to run.
So long as there is no 'reload' in the multi commands, all of the
commands will be validated.
The multi parser had very poor performance if there were many commands.
Evaluating the expansion of something like "compile;" * 30 could cause
sbt to hang indefinitely. I believe this was due to excessive
backtracking due to the optional `(parser <~ semi.?).?` part of the
parser in the non-leading semicolon case.
I also reworked the implementation so that the multi command now has a
name. This allows us to partition the commands into multi and non-multi
commands more easily in State while still having multi in the command
list. With this change, builds and plugins can exclude the multi parser
if they wish.
Using the partitioned parsers, I removed the high priority/low priority
distinction. Instead, I made it so that the multi command will actually
check if the first command is a named command, like '~'. If it is, it
will pass the raw command argument with the named command stripped out
into the parser for the named command. If that is parseable, then we
directly apply the effect. Otherwise we prefix each multi command to the
state.
We run into issues if we naively split the command input on ';' and
treat each part as a separate command unless the ';' is inside of a
string because it is also valid to have ';'s inside of braced
expressions, e.g. `set foo := { val x = 1; x + 1 }`. There was no parser
for expressions enclosed in braces. I add one that should parse any
expression wrapped in braces so long as each opening brace is matched by a
closing brace. The parser returns the original expression. This allows
the multi parser to ignore ';' inside of '{...}'.
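A toy version of the brace-aware scan (quoted strings are ignored here for brevity):
```
def splitTopLevel(s: String): List[String] = {
  val parts = List.newBuilder[String]
  val current = new StringBuilder
  var depth = 0
  s.foreach {
    case '{'               => depth += 1; current += '{'
    case '}'               => depth -= 1; current += '}'
    case ';' if depth == 0 => parts += current.toString.trim; current.clear()
    case c                 => current += c
  }
  parts += current.toString.trim
  parts.result().filter(_.nonEmpty)
}

// splitTopLevel("set foo := { val x = 1; x + 1 }; compile")
// == List("set foo := { val x = 1; x + 1 }", "compile")
```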
I had to rework the scripted tests to individually run 'reload' and
'setUpScripted' because the new parser rejects setUpScripted because it
isn't a valid command until reload has run.
It was reported in https://github.com/sbt/sbt/issues/4808 that compared
to 1.2.8, sbt 1.3.0-RC2 will truncate the command args of an input task
that contains semicolons. This is actually intentional, but not
completely robust. For sbt >= 1.3.0, we are making ';' syntactically
meaningful. This means that it always represents a command separator
_unless_ it is inside of a quoted string. To enforce this, the multi parser
will effectively split the input on ';', it will then validate that each
command that it extracted is valid. If not, it throws an exception. If
the input is not a multi command, then parsing fails with a normal
failure.
I removed the multi command from the state's defined commands and reworked
State.combinedParser to explicitly first try multi parsing and fall back
to the regular combined parser if it is a regular command. If the multi
parser throws an uncaught exception, parsing fails even if the regular
parser could have successfully parsed the command. The reason is so that
we do not ever allow the user to evaluate, say, 'run a;b'. Otherwise the
behavior would be inconsistent when the user runs 'compile; run a;b'.
There was an incomplete pattern match that assumed that the jars in the
scala provider included one with the name "scala-library.jar". In
practice, I think this is always true, but it's safer to have a fallback
case and it also removes the compiler warning.
At some point I noticed that projects with no scala sources in the build
loaded significantly faster than projects that had even a single scala
file -- no matter how simple that file was. This didn't really make
sense to me because *.sbt files _do_ have to be compiled. I finally
realized that classloading was a likely bottleneck because *.sbt
files are compiled on the sbt classpath while *.scala files are compiled
with a different classloader generated by the classloader cache. It then
occurred to me that we could pre-fill the classloader cache with the
scala layer of the sbt metabuild classloader.
I found that compared to 1.3.0-M5, a project with a simple scala file in
the project directory loaded about 2 seconds faster after this change.
Even if there are no scala sources in the build.sbt, there is a similar
performance improvement for running "sbt compile", which I found exited
2-3 seconds faster after this change.
The Reload exception that I added in the sbt package really wasn't
intended to be public. It's only meant to be used by
checkMetaBuildSources, which users shouldn't override. I put it in
the top package though because I wanted it to be next to FullReload. I
also am not sure why the Reload object in Watch was private[sbt], but
while writing documentation, I realized that users couldn't access it.
The docs for ClassLoader,
https://docs.oracle.com/javase/8/docs/api/java/lang/ClassLoader.html
say that all non-hierarchical custom classloaders should be registered
as parallel capable. The docs also suggest that custom classloaders
should try to only override findClass, so I reworked LayeredClassLoader to
only override findClass. I also added locking to the class loading to
make it safe for concurrent loading.
All of the custom classloaders besides LayeredClassLoader either
subclass URLClassLoader or LayeredClassLoader but don't override
loadClass. Because those two classloaders are parallel capable, the
subclasses should be as well. It isn't possible to make classloaders
that are implemented in scala parallel capable because scala 2 doesn't
support jvm static blocks (dotty does support this with an annotation).
To work around this, I re-worked some of the classloaders so that they
are either directly implemented in java or I subclassed a scala
implementation class in java.
I noticed that sbt 1.3.0 was using more cpu when idling (either at the
shell or while waiting for file events) than 1.2.8. This was because I'd
reduced a number of timeouts to 2 milliseconds which was causing a
thread to keep waking up every 2 milliseconds to poll a queue. I thought
that this polling was cheaper than it actually is: it drove the cpu
utilization to O(10%) of a cpu on my mac.
To address this, I consolidated a number of queues into a single queue
in CommandExchange and Continuous. In the CommandExchange case, I
reworked CommandChannel to have a register method that passes in a Queue
of CommandChannels. Whenever it appends an exec, it adds itself to the
queue. CommandExchange can then poll that queue directly and poll the
returned CommandChannel for the actual exec. Since the main thread is
blocking on this queue, it does not need to frequently wake up and can
just poll more or less indefinitely until a message is received. This
also reduces average latency compared to older versions of sbt since
messages will be processed almost as soon as they are received.
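The registration scheme reduces to something like this (simplified sketch; strings stand in for the real exec type):
```
import java.util.concurrent.{ ConcurrentLinkedQueue, LinkedBlockingQueue }

class CommandChannel {
  private val execs = new ConcurrentLinkedQueue[String]
  private var ready: LinkedBlockingQueue[CommandChannel] = _
  def register(queue: LinkedBlockingQueue[CommandChannel]): Unit = ready = queue
  def append(exec: String): Unit = {
    execs.add(exec)
    if (ready ne null) ready.put(this) // announce that this channel has work
  }
  def poll(): Option[String] = Option(execs.poll())
}

// main loop: block until some channel has work, then drain it
// val channel = channels.take(); channel.poll().foreach(runExec)
```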
The continuous case is slightly more complicated because we are polling
from two sources, stdin and FileEventMonitor. In my ideal world, I'd
have a reactive api for both of those sources and they would just write
events to a shared queue that we could block on. That is nontrivial to
implement, so instead I consolidated the FileEventMonitor instances into
a single FileEventMonitor. Since there is now only one FileEventMonitor
queue, we can block on that queue for 30 milliseconds and then poll
stdin. This reduces cpu utilization to O(2%) on my machine while still
having reasonably low latency for key input events (the latency of file
events should be close to zero since we are usually polling the
FileEventMonitor queue when waiting).
I actually had a TODO about the FileEventMonitor change that this
resolves.
This check doesn't actually make sense anymore with the new
ClassLoaderCache. In the old ClassLoaderCache, there were separate
layers for the snapshots and regular jars. The test was verifying that
only the snapshot layer was invalidated but now there is just one layer.
The dotty sbt-bridge module assumes that it's going to get a
URLClassLoader from which it can extract all of the classpath urls. That
doesn't work with the old wrapped classloader because its classpath was
empty. As a nasty workaround, I override the getURLs method, which is
where it gets the URLs from. After this change the
compiler-project/dotty-compiler-plugin test passes.
This commit adds a new ClassLoaderCache that builds on the
ClassLoaderCache that is present in zinc (and can be used to build an
instance of the zinc ClassLoaderCache to preserve compatibility). It
differs from the zinc classloader cache in that it does not use direct
SoftReferences to classloaders. Instead, we create a wrapper loader
that can't load any classes and just delegates to its parent. This
allows us to add a thread that reaps the soft reference to the wrapper
loader. Crucially, we add a custom SoftReference class that has a strong
reference to the underlying classloader. This allows us to call close on
the strong reference.
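The reference scheme boils down to something like this (simplified):
```
import java.lang.ref.{ ReferenceQueue, SoftReference }
import java.net.URLClassLoader

// the wrapper can't load anything itself; it only delegates to its parent
final class WrapperLoader(parent: URLClassLoader) extends ClassLoader(parent)

// soft reference to the wrapper, strong reference to the real loader, so
// close() can still be called after the wrapper is collected
final class CloseableReference(
    wrapper: WrapperLoader,
    queue: ReferenceQueue[WrapperLoader]
) extends SoftReference[WrapperLoader](wrapper, queue) {
  val underlying: URLClassLoader = wrapper.getParent.asInstanceOf[URLClassLoader]
}

// the reaper thread blocks on queue.remove(); when a wrapper is collected it
// receives the CloseableReference and can close reference.underlying
```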
The one issue with this approach is that we can't
rescue the jvm from crashing with an OOM: metaspace because the jvm
doesn't give us a chance to close and dereference the underlying
classloaders before it crashes. It WILL collect classloaders under
normal memory pressure, just not metaspace pressure. To fix this, I
check if the MaxMetaspaceSize is set via an MXBean and, if it is, we
fill the cache with regular soft references. We are going to change the
bash script to not set -XX:MaxMetaspaceSize by default so most builds
should probably end up correctly closing the classloaders after this
change. But we shouldn't break existing builds that set MaxMetaspaceSize
but don't crash.
As part of this commit, I audited all of the places where we were
instantiating ClassLoaderCache instances and instead pass in the
state's ClassLoaderCache instance. This reduces the total number of
classloaders created.
This commit finally fixes #241 by adding support for sbt to either
print a warning or automatically reload the project if the metabuild
sources have changed. To facilitate this, I introduce a new key,
metaBuildSourceOption, which has three options:
1) IgnoreSourceChanges
2) WarnOnSourceChanges
3) ReloadOnSourceChanges
When IgnoreSourceChanges is set, sbt will not check if the meta build sources
have changed. Otherwise, sbt will use the buildStructure / fileInputs to
get the ChangedFiles for the metabuild. If there are any changes, it
will either warn or reload the build depending on the value of
metaBuildSourceOption.
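For example, to opt in to automatic reloads (using the key name from this commit; the exact scoping is an assumption):
```
Global / metaBuildSourceOption := ReloadOnSourceChanges
```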
The mechanism for diffing the files is that I add a step to EvaluateTask
where, if the project has been loaded and
metaBuildSourceOption != IgnoreSourceChanges, we evaluate the needReload
task. If we need a reload, we return an error that indicates that a
Reload is necessary. When that error is detected, the MainLoop will
prepend "reload" to the pending commands for the state. Otherwise we
just print a warning and continue.
I benchmarked the overhead of this and it wasn't too bad. I generally
saw it taking 5-20ms to perform the check. Since this is only done once
per task evaluation run, I don't think it's a big deal. When
IgnoreSourceChanges is set, there is O(10us) overhead. If performance
does become a problem, we could add a global watch service and skip the
needReload evaluation if no files have been modified.
I removed the watchTrackMetaBuild key and made it so that the continuous
builds only track the meta build when
metaBuildSourceOption == ReloadOnSourceChanges
The newest version of io repackages a number of classes into the
sbt.nio.* packages. It also changes some of the semantics of glob
related apis. This commit updates all of the usages of the updated apis
within sbt but should have no functional difference.
Since the new watch implementation has yet to be widely deployed, we
should hold off on deprecating the old keys. They could still be
deprecated in a patch release or in 1.4.0.
This commit reworks the watch start message so that instead of printing
something like:
[info] [watch] 1. Waiting for source changes... (press 'r' to re-run the command, 'x' to exit sbt or 'enter' to return to the shell)
it instead prints something like:
[info] 1. Monitoring source files for updates...
[info] Project: filesJVM
[info] Command: compile
[info] Options:
[info] <enter>: return to the shell
[info] 'r': repeat the current command
[info] 'x': exit sbt
It will also print which path triggered the build.
I decided that it makes sense to move all of the new watch code out of
the Watched companion object since the Watched trait itself is now
deprecated. I don't really like having the new code in Watched.scala
mixed with the legacy code, so I pulled it all out and moved it into the
Watch object. Since we have to put all of the logic for the Continuous
object in main in order to access the sbt.Keys object, it makes sense to
move the logic out of main-command and into command so that most of the
watch related logic is in the same subproject.
This is a huge refactor of Watched. I produced this through multiple
rewrite iterations and it was too difficult to separate all of the
changes into small individual commits so I, unfortunately, had to make a
massive commit. In general, I have tried to document the source code
extensively both to facilitate reading this commit and to help with
future maintenance.
These changes are quite complicated because they provide a built-in-like
api to a feature that is implemented like a plugin. In particular,
we have to manually do a lot of parsing as well as roll our own
task/setting evaluation because we cannot infer the watch settings at
project build time because we do not know a priori what commands the
user may watch in a given session. The dynamic setting and task
evaluation is mostly confined to the WatchSettings class in Continuous.
It feels dirty to do all of this extraction by hand, but it does seem to
work correctly with scopes.
At a high level this commit does four things:
1) migrate the watch implementation to using the InputGraph to collect
the globs that it needs to monitor during the watch
2) simplify WatchConfig to make it easier for plugin authors to write
their own custom watch implementations
3) allow configuration of the watch settings based on the task(s) that
is/are being run
4) add an InputTask implementation of watch.
Point #1 is mostly handled by Point #3 since I had to overhaul how _all_
of the watch settings are generated. InputGraph already handles both
transitive inputs and triggers as well as legacy watchSources so not
much additional logic is needed beyond passing the correct scoped keys
into InputGraph.
Point #3 requires some structural changes. The watch settings cannot in
general be defined statically because we don't know a priori what tasks
the user will try to watch. To address this, I added code that will
extract the task keys for all of the commands that we are running. I
then manually extract the relevant settings for each command. Finally, I
aggregate those settings into a single WatchConfig that can be used to
actually implement the watch. The aggregation is generally
straightforward: we run all of the callbacks for each task and choose
the next watch state based on the highest priority Action that is
returned by any of the callbacks.
Because I needed Extracted to pull out the necessary settings, I was
forced to move a lot of logic out of Watched and into a new singleton,
Continuous, that exists in the main project (Watched is in the command
project). The public footprint of Continuous is tiny. Even though I want
to make the watch feature flexible for plugin authors, the
implementation and api remain a moving target so I do not want to be
limited by future binary compatibility requirements. Anyone who wants to
live dangerously can access the private[sbt] apis via reflection or by
adding custom code to the sbt package in their plugin (a technique I've
used in CloseWatch).
Point #2 is addressed by removing the count and lastStatus from the
WatchConfig callbacks. While these parameters can be useful, they are
not necessary to implement the semantics of a watch. Moreover, a status
boolean isn't really that useful and the sbt task engine makes it very
difficult to actually extract the previous result of the tasks that were
run. After this refactor, WatchConfig has a simpler api. There are fewer
callbacks to implement and the signatures are simpler. To preserve the
_functionality_ of making the count accessible to the user specifiable
callbacks, I still provided settings like watchOnInputEvent that accept
a count parameter, but the count is actually tracked externally to
Watched.watch and incremented every time the task is run.
Moreover, a few parameters of the watch, namely the logger and the
transitive globs, cannot be provided via settings. I provide
callback settings like watchOnStart that mirror the WatchConfig
callbacks except that they return a function from Continuous.Arguments
to the needed callback. The Continuous.aggregate function will check if
the watchOnStart setting is set and if it is, will pass in the needed
arguments. Otherwise it will use the default watchOnStart implementation
which simulates the existing behavior by tracking the iteration count in
an AtomicInteger and passing the current count into the user provided
callback. In this way, we are able to provide a number of apis to the
watch process while preserving the default behavior.
To implement #4, I had to change the label of the `watch` attribute key
from "watch" to "watched". This allows `watch compile` to work at the
sbt command line even though it maps to the watchTasks key. The actual
implementation is almost trivial. The difference between an
InputTask[Unit] and a command is very small. The tricky part is that the
actual implementation requires applying mapTask to a delegate task that
overrides the Task's info.postTransform value (which is used to
transform the state after task evaluation). The actual postTransform
function can be shared by the continuous task and continuous command.
There is just a slightly different mechanism for getting to the state
transformation function.
I realized that Stamped.File was a bad interface that was really just an
implementation detail of external hooks. I updated the
GlobLister.{ all, unique } methods to return Seq[(Path, FileAttributes)]
rather than Stamped.File which is a much more natural api and one I
could see surviving the switch to nio based apis planned for
1.4.0/2.0.0. I also added a simple scripted test for glob listing. The
GlobLister.all method is implicitly tested all over the place since the
compile task uses it, but it's good to have an explicit test.
I decided that FileCacheEntry was a bad name because the methods did not
necessarily have anything to do with caching. Moreover, because it is
exposed in a public interface, it shouldn't be in the internal package.
Rather than exposing the FileEventMonitor.Event types, which are under
active development in the io repo, I am adding a new event trait to
FileCacheEntry. This trait doesn't expose any internal implementation
details.
The equals method didn't work exactly the way I thought. By delegating
to the equivStamp object in sbt we can be more confident that it is
actually comparing the stamp values and not object references or
some other equals implementation.
Right now, the sbt.internal.io.Source is something of a second class
citizen within sbt. Since sbt 0.13, there have been extension classes
defined that can convert a file to a PathFinder but no analog has been
introduced for sbt.internal.io.Source.
Given that sbt.internal.io.Source was not really intended to be part of
the public api (just look at its package), I think it makes sense to
just replace it with Glob. In this commit, I add extension
methods to Glob and Seq[Glob] that make it possible to easily
retrieve all of the files for a particular Glob within a task. The
upshot is that where previously, we'd have had to write something like:
```
watchSources += Source(baseDirectory.value / "src" / "main" / "proto", "*.proto", NothingFilter)
```
now we can write
```
watchGlobs += baseDirectory.value / "src" / "main" / "proto" * "*.proto"
```
Moreover, within a task, we can now do something like:
```
foo := {
  val allWatchGlobs: Seq[File] = watchGlobs.value.all
  println(allWatchGlobs.mkString("all watch source files:\n", "\n", ""))
}
```
Before we would have had to manually retrieve the files.
The implementation of the dsl uses the new GlobExtractor class which
proxies file look ups through a FileTree.Repository. This makes it so
that, by default, all file i/o using Sources will use the default
FileTree.Repository. The default is a macro that returns
`sbt.Keys.fileTreeRepository.value: @sbtUnchecked`. By doing it this
way, the default repository can only be used within a task definition
(since it delegates to `fileTreeRepository.value`). It does not,
however, prevent the user from explicitly providing a
FileTree.Repository instance which the user is free to instantiate
however they wish.
Bonus: optimize imports in Def.scala and Defaults.scala
The FileTreeViewConfig abstraction that I added was somewhat unwieldy
and confusing. The original intention was to provide users with a lot of
flexibility in configuring the global file tree repository used by sbt.
I don't think that flexibility is necessary and it was both conceptually
complicated and made the implementation complex. In this commit, I add a
new boolean flag enableGlobalCachingFileTreeRepository that toggles
which kind of FileTreeRepository to use globally.
There are actually three kinds of repositories that could be returned:
1) FileTreeRepository.default -- this caches the entire file system
tree and hooks into the cache's event callbacks to create a file event
monitor. It will be used if enableGlobalCachingFileTreeRepository is
true and Global / pollingGlobs is empty
2) FileTreeRepository.hybrid -- similar to FileTreeRepository.default
except that it will not cache any files that are included in
Global / pollingGlobs. It will be used if
enableGlobalCachingFileTreeRepository is true and
Global / pollingGlobs is non empty
3) FileTreeRepository.legacy -- does not cache any of the file system
tree, but does maintain a persistent file monitoring process that is
implemented with a WatchServiceBackedObservable. Because it doesn't
poll, in general, it's ok to leave the monitoring on in the
background. One reason to use this is that if there are any issues
with the cache being unable to accurately mirror the underlying file
system tree, this repository will always poll the file system
whenever sbt requests the entries for a given glob. Moreover, the
file system tree implementation is very similar to the implementation
that was used in 1.2.x so this gives users a way to almost fully opt
back in to the old behavior.
This new version of io breaks source and binary compatibility everywhere
that uses the register(path: Path, depth: Int) method that is defined on
a few interfaces because I changed the signature to register(glob:
Glob). I had to convert to using a glob everywhere that register was
called.
I also noticed a number of places where we were calling .asFile on a
file. This is redundant because asFile is an extension method on File
that just returns the underlying file.
Finally, I share the IOSyntax trait from io in AllSyntax. There was more
or less a TODO suggesting this change. The one hairy part is the
existence of the Alternative class. This class has unfortunately somehow
made it into the sbt package object. While I doubt many plugins are
using this, it doesn't seem worth breaking binary compatibility to get
rid of it. The issue is that while Alternative is defined private[sbt],
the alternative method in IOSyntax is public, so I can't get rid of
Alternative without breaking binary compatibility.
I'm not deprecating Alternative for now because the sbtProj still has
-Xfatal-warnings on. I think in many, if not most, cases, the Alternative
class makes the code more confusing as is often the case with custom
operators. The confusion is mitigated if the abstraction is used only in
the file in which it's defined.
Previously, we were leaking the internal details of incremental
compilation to users by defining FileTree(DataView|Repository)[Stamp].
To avoid this, I introduce the new class FileCacheEntry that is quite
similar to Stamp except defined using scala Options rather than java
Optionals. The implementation class just delegates to an actual Stamp
and I provided a private[sbt] ops class that adds a
method `stamp` to FileCacheEntry. This will usually just extract the
stamp from the implementation class. This allows us to use
FileCacheEntry almost interchangeably with Stamp while still avoiding
exposing users to Stamp.
In the FileTreeDataView use case, we were previously working with
FileTreeDataView[Stamped], which actually contained a lot of redundant
information because FileTreeDataView.Entry[_] has a toTypedPath method
that could be used to read the path related fields in Stamped. Instead,
we can just return the Stamp itself in FileTreeDataView.list* methods
and convert to Stamped.File where needed (i.e. in ExternalHooks).
Also move BasicKeys.globalFileTreeView to Keys since it isn't actually
used in the main-command project.
We want the user to be able to invalidate the classloader cache in the
event that it somehow gets in a bad state. The cache is, however,
defined in multiple configurations, so there are in fact many
ClassLoaderCache instances that are managed by sbt. To make this sane, I
add a global cache that is keyed by a TaskKey[_] and can return
arbitrary data back. Invalidating all of the ClassLoaderCache instances
is then as straightforward as just replacing the TaskRepository
instance.
I also went ahead and unified the management of the global file tree
repository. Instead of having to specifically clear the file tree
repository or the classloader cache, the user can now invalidate both
with the new clearCaches command.