Although log manager is in the internal package, akka uses it and it
seems worth preserving binary compatibility in their build, at least for
sbt 1.4.0-M2. Once the community build is passing, we can consider
reverting this.
```
[info] welcome to sbt 1.4.0-SNAPSHOT (AdoptOpenJDK Java 1.8.0_232)
[info] loading settings for project global-plugins from ...
[info] loading global plugins from ...
[info] loading project definition from /private/tmp/hello/project
[info] loading settings for project root from build.sbt ...
[info] set current project to hello (in build file:/private/tmp/hello/)
[info]
[info] Here are some highlights of this release:
[info] - Build server protocol (BSP) support
[info] - sbtn: a native thin client for sbt
[info] - VirtualFile + RemoteCache: caches build artifacts across different machines
[info] - Incremental build pipelining. Try it using `ThisBuild / usePipelining := true`.
[info] See http://eed3si9n.com/sbt-1.4.0-beta for full release notes.
[info] Hide the banner for this release by running `skipBanner`.
[info] sbt server started at local:///Users/eed3si9n/.sbt/1.0/server/478e6db75688771ddcf1/soc
```
Frequently ctrl+c does not work to cancel the running tasks. This seems
to be because the signal handler is bound to a specific instance of
evaluate task but there may be multiple instances of evaluate task
running at any given time. Shutting down just one of the running engines
does not ensure that task evaluation stops. To work around this, we can
globally store all of the completion services in a weak hash map and
cancel all of them whenever a signal is received. Closing the service,
which happens at the end of task evaluation will remove the service from
the map so hopefully this shouldn't introduce a memory leak.
While dogfooding the latest sbt, I noticed that sometimes the watch
input threads leak. I suspect this happens when a build is immediately
triggered by a file that was modified during compilation. Though I
didn't fully verify this, it's likely that we interrupted the input
reading thread before it actually started reading. When it started
reading after the interrupt, it would block until the user entered
another input character. The result was that the zombie thread would
effectively steal the next character from the input stream which
manifested as the first character being ignored when the user tried to
enter a watch input option. If more than one thread leaked, then it may
take a number of keystrokes before the user regained control.
To fix this, we can ensure that all watch related threads are joined
before we exit watch. To avoid completely blocking the ui, we only try
to join the threads for a second and print an error if the join fails.
This shouldn't be the case so if users see this error, then we need to
fix the bug.
In 0d2b00c7e9, I introduced a setting for
the virtual file defines class cache to avoid ooms coming from zinc
stamping the project jar files. I introduced that cache at the compile
level though rather than global level and crashes were still occurring
in the sbt build. It was very easy to induce a crash on my computer by
running compile a few times, reload and then compile again. After making
the cache global, the crashes went away.
This attempts to delay the initialization of Coursier cache, such that it will not trigger Coursier directory related code if `ThisBuild / useCoursier` or `-Dsbt.coursier` is set to `false`.
With the latest sbt snapshot, the ui would get stuck if the user entered
an empty command. They would be presented with an empty prompt and could
not input any commands. This was caused by the change in
d569abe70a that reset the prompt after a
line was read. I had tried to optimize line reading by ignoring empty
commands in UITask.readline so we wouldn't have to make a new thread.
This optimization wasn't really buying much since it only affects how
quickly the user is reprompted after entering an empty command. Unless a
user is spamming the <enter> key, they shouldn't notice a difference.
This commit adds a few options to supershell:
1. Max items -- sets the max number of tasks to display in the progress
reports. It is pretty hard to read more than a few items in the
progress reports so I set the default limit to 8 and made that
configurable via the superShellMaxTasks parameter. If there are more
than the limit, there is an additional line telling how many additional
tasks are running
2. sleep -- sets how long to sleep between reports. The default is 500ms
to ensure that it updates at least once per second but the previous
value of 100ms is more frequent than necessary
3. threshold -- sets the minimum duration a task has to run before being
printed in the progress reports. The default threshold is increased
from 10ms to 100ms. This introduces a delay of threshold milliseconds
before any progress lines appear and also means that if no tasks ever
exceed the threshold, then no progress is ever displayed.
It turns out that task progress actually introduces a fair bit of
overhead. The biggest issue is that the task progress callbacks block
the Execute main thread. This means that time in those callbacks
delays task evaluation, slowing down sbt. This was not negligible, I was
seeing a lot of the total time of a no-op compile in
https://github.com/jtjeferreira/sbt-multi-module-sample was spent in
TaskProgress callbacks. Prior to these changes, I ran 30 no-op compiles
in that project and the average time was about 570ms. This number got
worse and worse because there were memory leaks in the TaskProgress
object. After these changes, it dropped to 250ms and after jit-ing, it
would drop to about 200ms. I also successfully ran 5000 consecutive
no-op compiles without leaking any memory.
A lot of the overhead of task progress was in adding tasks to the
timings map in AbstractTaskProgress. Tasks were never removed and
ConcurrentHashMap insertion time is proportional to the size of the map
(not sure if it's linear, quadratic or other) which was why sbt actually
got slower and slower the longer it ran. Much of the time was spent
adding tasks to the progress timings.
To fix this, I did something similar to what I did to manage logger
state in https://github.com/jtjeferreira/sbt-multi-module-sample. In
MainLoop, we create a new TaskProgress instance before command
evaluation and clean it up after. Earlier I made TaskProgress an object
to try to ensure there was only one progress thread at a time, and that
introduced the memory leak. In addition to removing the leak, I was able
to improve performance by removing tasks from the timings map when they
completed. Unlike TaskTimings and TaskTraceEvent, we don't care about
tasks that have completed for TaskProgress so it is safe to remove them.
In addition to the memory leaks, I also reworked how the background
threads work. Instead of having one thread that sleeps and prints
progress reports, we now use two single threaded executors. One is a
scheduled executor that is used to schedule progress reports and the
other is the actual thread on which the report is generated. When
progress starts, we schedule a recurring report that is generated every
sleep interval until task evaluation completes. Whenever we add a new
task, if we have haven't previously generated a progress report, we
schedule a report in threshold milliseconds. If the task completes
before the threshold period has elapsed, we just cancel the schedule
report. By doing things this way, we reduce the total number of reports
that are generated. Because reports need to effectively lock System.out,
the less we generate them, the better.
I also modified the internal data structures of AbstractTaskProgress so
that there is a single task map of timings instead of one map for
timings and one for active tasks.
It was a bit tricky to reason about the state of the prompt for a
terminal. To help make things more clear, I reworked things so that the
LineReader always sets the prompt to Pending after it reads a command.
In MainLoop, we cache the prompt value and temporarily set it to Running
while the command is running, which is really how it should have always
been.
In order to make the console task work with scala 2.13 and the thin
client, we need to provide a way for the scala repl to use an sbt
provided jline3 terminal instead of the default terminal typically built
by the repl. We also need to put jline 3 higher up in the classloading
hierarchy to ensure that two versions of jline 3 are not loaded (which
makes it impossible to share the sbt terminal with the scala terminal).
One impact of this change is the decoupling of the version of
jline-terminal used by the in process scala console and the version
of jline-terminal specified by the scala version itself. It is possible
to override this by setting the `useScalaReplJLine` flag to true. When
that is set, the scala REPL will run in a fully isolated classloader. That
will ensure that the versions are consistent. It will, however, for sure
break the thin client and may interfere with the embedded shell ui.
As part of this work, I also discovered that jline 3 Terminal.getSize is
very slow. In jline 2, the terminal attributes were automatically cached with a
timeout of, I think, 1 second so it wasn't a big deal to call
Terminal.getAttributes. The getSize method in jline 3 is not cached and
it shells out to run a tty command. This caused a significant
performance regression in sbt because when progress is enabled, we call
Terminal.getSize whenever we log any messages. I added caching of
getSize at the TerminalImpl level to address this. The timeout is 1
second, which seems responsive enough for most use cases. We could also
move the calculation onto a background thread and have it periodically
updated, but that seems like overkill.
There are cases where if the ui state is changing rapidly, that an
AskUserThread can be created and cancelled in a short time windows. This
could cause problems if the AskUserThread is interrupted during
`LineReader.createReader` which I think can shell out to run some
commands so it is relatively slow. If the thread was interrupted during
the call to `LineReader.createReader` and the interruption was not
handled, then the thread would go into `LineReader.readLine`, which
wouldn't exit until the user pressed enter. This ultimately caused the
ui to break until enter because this zombie line reader would be holding
the lock on the terminal input stream.
Prior to these changes, sbt was leaking large amounts of memory via
log4j appenders. sbt has an unusual use case for log4j because it
creates many ephemeral loggers while also having a global logger that is
supposed to work for the duration of the sbt session. There is a lot of
shared global state in log4j and properly cleaning up the ephemeral task
appenders would break global logging. This commit fixes the behavior by
introducing an alternate logging implementation. Users can still use the
old log4j logging implementation but it will be off by default. The
internal implementation is very simple: it just blocks the current
thread and writes to all of the appenders. Nevertheless, I found the
performance to be roughly identical to that of log4j in my sample
project. As an experiment, I did the appending on a thread pool and got
a significant performance improvement but I'll defer that to a later PR
since parallel io is harder to reason about.
Background: I was testing sbt performance in
https://github.com/jtjeferreira/sbt-multi-module-sample and noticed that
performance rapidly degraded after I ran compile a few times. I took a
heap dump and it became obvious that sbt was leaking console appenders.
Further investigation revealed that all of the leaking appenders in the
project were coming from task streams. This made me think that the fix
would be to track what loggers were created during task evaluation and
clear them out when task evaluation completed. That almost worked except
that log4j has an internal append only data structure containing logger
names. Since we create unique logger names for each run, that internal
data structure grew without bound. It looked like this could be worked
around by creating a new log4j Configuration (where that data structure
was stored) but while creating new configurations with each task runs
did fix the leak, it also broke global logging, which was using a
different configuration. At this point, I decided to write an alternate
implementation of the appender api where I could be sure that the
appenders were cleaned up without breaking global logging.
Implementation: I made ConsoleAppender a trait and made it no longer
extends log4j AbstractAppender. To do this, I had to remove the one
log4j specific method, append(LogEvent). ConsoleAppender now has a
method toLog4J that, in most cases, will return a log4j Appender that is
almost identical to the Appenders that we previously used. To manage
the loggers created during task evaluation, I introduce a new class,
LoggerContext. The LoggerContext determines which logging backend to use
and keeps track of what appenders and loggers have been created. We can
create a fresh LoggerContext before each task evaluation and clear it
out, cleaning up all of its resources after task evaluation concludes.
In order to make this work, there were many places where we need to
either pass in a LoggerContext or create a new one. The main magic is
happening in the `next(State)` method in Main. This is where we create a
new LoggerContext prior to command evaluation and clean it up after the
evaluation completes.
Users can toggle log4j using the new useLog4J key. They also can set the
system property, sbt.log.uselog4j. The global logger will use the sbt
internal implementation unless the system property is set.
There are a fairly significant number of mima issues since I changed the
type of ConsoleAppender. All of the mima changes were in the
sbt.internal package so I think this should be ok.
Effects: the memory leaks are gone. I successfully ran 5000 no-op
compiles in the sbt-multi-module-sample above with no degradation of
performace. There was a noticeable degradation after 30 no-op compiles
before.
During the refactor, I had to work on TestLogger and in doing so I also
fixed https://github.com/sbt/sbt/issues/4480.
This also should fix https://github.com/sbt/sbt/issues/4773
Zinc frequently needs to check the library classpath to ensure that
class names are defined in a given jar. There is a cost to looking up
the class names in the jar so it's a benefit to cache this across runs
so that we don't have to redo the same work every time. More
importantly, in testing with the latest sbt HEAD, I found that sbt would
crash fairly frequently because it ran out of direct memory, which is
used by nio to read and write to native memory without copying. The
direct memory area is shared with the java heap and if it reaches the
limit, the jvm crashes hard as though kill -9 was invoked. After caching
the entries, I stopped seeing crashes.
Rather than relying on a command, I realized it makes more sense to
explicitly set the terminal for the calling channel in MainLoop. By
doing it this way, we can also ensure that we always reset it to the
previous value.
Using the scala reflect library always introduces significant
classloading overhead. We can eliminate the classloading overhead by
generating StringTypeTags at compile time instead.
This sped up average project loading time by a few hundred milliseconds
on my computer. The ManagedLoggedReporter in zinc is still using the
type tag based apis but after the next sbt release, we can upgrade the
zinc apis. We also could consider breaking binary compatibility.
sbt depends on scalacache (which hasn't been updated in about a year)
and we really don't need the functionality provided by scalacache. In
fact, the java api is somewhat easier to work with for our use case. The
motivation is that scalacache uses slf4j for logging which meant that it
was implicitly loading log4j. This caused some noisy logs during
shutdown when the previously unused cache was initialized just to be
cleaned up.
This commit also upgrades caffeine and moving forward we can always
upgrade caffeine (and potentially shade it) without any conflict with
the scalacache version.
Upon successful registration with a FileTreeRepository, an Observable is
returned by the FileTreeRepository that can be used to observer the
specific globs that were registered. The FileTreeRepository also has a
global Observable that can be used to monitor _all_ events. In order to
implement this feature, internally the FileTreeRepository needs to hold
a reference to the registered Observable so that it forwards relevant
file change events. If we do not close the Observable, it leaks memory
inside of FileTreeRepository. There were a number of places within sbt
where we registered globs and did nothing with the returned Observable.
It was thus straightforward to fix the leak by just closing the returned
Observables.
This came up because I was looking at a heap dump of
https://github.com/jtjeferreira/sbt-multi-module-sample after running
1000 no-op compiles and noticed that the FileTreeRepository.observables
were taking up 75MB out of a total heap of about 300MB.
As a side note, it would be nice if sbt had a warning for unused return
values when a statement is not the last in a block. It's possible that
these leaks wouldn't have happened if we were forced to handle the
returned Observables.
Ref https://github.com/sbt/zinc/pull/744
This implements `ThisBuild / usePipelining`, which configures subproject pipelining available from Zinc 1.4.0.
The basic idea is to start subproject compilation as soon as pickle JARs (early output) becomes available. This is in part enabled by Scala compiler's new flags `-Ypickle-java` and `-Ypickle-write`.
The other part of magic is the use of `Def.promise`:
```
earlyOutputPing := Def.promise[Boolean],
```
This notifies `compileEarly` task, which to the rest of the tasks would look like a normal task but in fact it is promise-blocked. In other words, without calling full `compile` task together, `compileEarly` will never return, forever waiting for the `earlyOutputPing`.
It can easily take 2ms or more to parse a command depending on state's
combined parser. There are some commands that sbt requires to work that
we can handle in microseconds instead of milliseconds by special casing
them.
After this change, I saw the performance of
https://github.com/eatkins/scala-build-watch-performance improve by
a consistent 4-5ms in the 3 source file example which was a drop from
120ms to 115ms. While not necessarily earth shattering, this difference
could theoretically be much worse in other projects that have a lot of
plugins and custom tasks/commands. I think it's worth the modest
maintenance cost.
Ref https://github.com/sbt/sbt/issues/5710
Ref https://github.com/sbt/librarymanagement/pull/339
This adds `versionScheme` setting. When set, it is included into POM, and gets picked up on the other side as an extra attribute of ModuleID. That information in turn is used to inform the eviction warning.
This should reduce the false positives associated with SemVer'ed libraries showing up in the eviction warning.
The 1.4.0 implementation of watch uses a concurrent hash map to maintain
the global watch state which manages the state for an arbitrary number
of clients. Using a mutable map is not idiomatic sbt and I found it
difficult to reason about when the map was updated. This commit reworks
the feature so that the global state is instead stored in an immutable
map that is only modified during the internal watch commands, which is
easier to reason about.
The EventsTest changes kept appearing. I'm not sure why scalafmt check
was allowing it before. My vim status bar warns me about trailing spaces
and I noticed the two in Keys.scala and removed them.
In eb688c9ecd, we started buffering output
to the remote client to reduce flickering. This was causing problems
with the output for the thin client in batch mode. With the delay, it
was possible for the client to exit before all of its output had been
displayed.
Bonus: only display aggregation error message if terminal has success
enabled (the thin client displays its own timing message so the message
in aggregation ended up being a duplicate).
It is expensive to compute the the hash of every jar on the classpath so
we can try to avoid that by using the timeWrappedStamper which only
computes the hash if the last modified time has changed.
Using the managedCached introduced an unintended performance regression
because it ensured that we always computed the hash of each jar on the
dependency classpath. The backing ReadStamps only computes the stamp if
the timestamp of the jar has changed.
I noticed that if lintUnusedKeysOnLoad := true is set, it emits a lint
warning.
As a side note, Project linting takes about 300-400ms in the sbt project
so we might want to consider disabling it by default in batch mode at
least.
The sbt project load made a number of relatively inefficient
transformations of scala collecitons. I went through and found the slow
parts during project loading and made my best attempt at fixing them.
The most significant changes I made were in places using IMap. An IMap
is more or less a wrapper around an immutable Map. It can be much faster
to construct an IMap by creating a java mutable hashmap, wrapping it a
scala Map that delegates to the underlying java hashmap (with a copy on
write if the map is modified) and constructing the IMap from the wrapped
map. It was also in many cases to parallelize some transformations
wherever the order didn't matter.
After applying all of these changes, I found that loading the sbt
project took generally between 8.5 and 9 seconds on my laptop. With
1.3.13, it hovered around 11.5 seconds. I saw a similar speedup in zinc.
The biggest specific improvement was that generating the compiled map
dropped from between 3.5-4 seconds to pretty consistently being around
1.5 seconds.
This reverts commit b1dcf031a5.
I found that b1dcf031a5 had some
unintended consequences that seemed to mess up the prompt state. The
real problem that it was trying to address was that the prompt was being
interleaved with log messages in some scenarios. There was a different
way to fix that in ProgressState that was both simpler and more
reliable.
The jline2 history file format is incompatible with jline3 and jline3
prints a very noisy warning if it detects such a file. History will also
not work with jline3 until you remove or reformat the old file.
I noticed that when reloading the build, that certain errors are logged
by sbt to System.err. These were not shown to a thin client because we
weren't forwarding System.err. This change remedies that.
System.err is handled more simply than System.out. We do not put
System.err through the progress state because generally System.err is
tends to be unbuffered. I had hesitated to add System.err to the
Terminal interface at all to give users an escape hatch but I couldn't
get project loading to work well with the thin client without it.
In db4878c786, TaskProgress became an
object which mostly made things easier to reason about. The one problem
was that it started leaking tasks with every run because the timings map
would accumulate tasks that weren't cleared. To fix this, we can clear
the timings and activeTasksMap in the TaskProgress object in the
afterAllCompleted callback. Some extra null checks needed to be added
since it's possible for the maps to not contain a previously added key
after reset has been called.
This fixes the nio/external-hooks test and also restores the performance
of the benchmarks for the latest sbt version in
https://github.com/eatkins/scala-build-watch-performance which had
regressed when the custom ExternalHooks were disabled in
7c4b01d9f7.
The main change is that it changes the ReadStamps object that is passed
into the compiler options to one that uses the unmanagedFileStampCache
and managedFileStampCache for source files and falls back to the default
stamper otherwise. This improves the performance quite significantly
since we only hash the files once. It also makes it so that the analysis
file will contain the source file stamps of the files when compilation
began, rather than when compilation ended. That is what
nio/external-hooks was testing. In the real world what could happen was
that one modified a source file during compilation but then no
incremental re-compilation would occur because after the initial
compilation completed, zinc wrote the stamp of the modified source file
in the analysis file even though it may have actually compiled a
different version of the source file.
I noticed some RejectedExecutionExceptions in travis failures of
ClientTest. This could happen if we try to write output to the
network channel after it has been closed. To avoid this problem, we can
catch RejectedExecutionExceptions and do an immediate flush if the
executor has been shutdown.
In order for sbt to function well, it needs the test interface, jansi
and forked jline jars provided by a classloader that is parent to all
other sbt classloaders. To do this for just the test interface jar, I
just checked if the top loader in the app configuration had the correct
name. Now that there are three jars, this is more complicated so I
updated the launcher to create a top loader with the method getEarlyJars
implemented and returning the three needed jars. This is a much more
scalable design.
If sbt is entered with a configuration that does not have a top loader
with the getEarlyJars method defined, then we just fall back on
constructing the default layered classlaoder from the configuration
classpath.
The motivation for this change is that I discovered that sbt immediately
crashed when I tried to run a non-snapshot version. After this change, I
verified that both snapshot and non-snapshot versions of the latest sbt
code could load with both an obsolete and up-to-date launcher.
The unprompt method will actually kill the ui thread if its running. If
we don't to this, we can get into a weird state where after watch is
exited by <enter>, the ask user thread spins up but before it can print
the prompt, the terminal prompt is changed to Running, which has an
empty prompt. The end result is that jline3 never displays the prompt
even though the line reader is active and reading bytes. When the user
typed <enter> a prompt would appear. They also could input a command and
it would run but it wasn't obvious what would happen since the prompt
was missing.
The build source check is evaluated at times when we can't be completely
sure that global logger is pointing at the terminal that initiated the
reload (which may be a passive watch client). To work around this, we
can inspect the exec to determine which terminal initiated the check and
write any output directly to that terminal.
This commit upgrades sbt to using jline3. The advantage to jline3 is
that it has a significantly better tab completion engine that is more
similar to what you get from zsh or fish.
The diff is bigger than I'd hoped because there are a number of
behaviors that are different in jline3 vs jline2 in how the library
consumes input streams and implements various features. I also was
unable to remove jline2 because we need it for older versions of the
scala console to work correctly with the thin client. As a result, the
changes are largely additive.
A good amount of this commit was in adding more protocol so that the
remote client can forward its jline3 terminal information to the server.
There were a number of minor changes that I made that either fixed
outstanding ui bugs from #5620 or regressions due to differences between
jline3 and jline2.
The number one thing that caused problems is that the jline3 LineReader
insists on using a NonBlockingInputStream. The implementation ofo
NonBlockingInputStream seems buggy. Moreover, sbt internally uses a
non blocking input stream for system in so jline is adding non blocking
to an already non blocking stream, which is frustrating.
A long term solution might be to consider insourcing LineReader.java
from jline3 and just adapting it to use an sbt terminal rather than
fighting with the jline3 api. This would also have the advantage of not
conflicting with other versions of jline3. Even if we don't, we may want to
shade jline3 if that is possible.
It is possible for sbt to get into a weird state when in a continuous
build when the auto reload feature is on and a source file and a build
file are changed in a small window of time. If sbt detects the source
file first, it will start running the command but then it will
autoreload when it runs the command because of the build file change.
This causes the watch to get into a broken state because it is necessary
to completely restart the watch after sbt exits.
To fix this, we can aggregate the detected events in a 100ms window. The
idea is to handle bursts of file events so we poll in 5ms increments and
as soon as no events are detected, we trigger a build.
Remote clients sometimes flicker when updating progress. This is especially
noticeable when there are two clients and one of them is running a command,
the other will tend to have some visible flickering and character ghosting.
As an experiment, I buffered calls to flush in the NetworkChannel output
stream and the artifacts went away.
Running a `~` command in a local build off the latest develop branch
will cause the build to reload even if the build sources were only
touched and not actually modified.
In the situation where sbt was started in server mode and a client is
running a `~` command and a project reload is triggered by a change to
a build source, the console terminal looks like
sbt:foo>
[info] received remote command: ~compile
sbt:foo>
[info] welcome to sbt 1.4.0-SNAPSHOT (Azul Systems, Inc. Java 1.8.0_252)
sbt:foo>
[info] loading global plugins from ~/.sbt/1.0/plugins
sbt:foo>
[info] loading settings for project foo-build from metals.sbt ...
sbt:foo>
[info] loading project definition from
~/foo/project
sbt:foo>
[info] loading settings for project root from build.sbt ...
sbt:foo>
[info] loading settings for project macros from build.sbt ...
sbt:foo>
[info] loading settings for project main from build.sbt ...
sbt:foo>
[info] set current project to foo (in build file:~/foo)
sbt:foo>
This change fixes that by unprompting all channels during project
loading and reprompting them when it completes.
Linting unused keys was adding a significant overhead to sbt project
loading because Def.compiled is so slow. It was around 4 seconds in the
sbt project on my computer.
The continuous command recompiles the setting graph into a CompiledMap
data structure so that it can determine which files it needs to
transitively monitor during watch. Generating the CompiledMap can be
very slow for large projects (5 seconds or so on my computer in the sbt
project) and this startup cost is paid every time the user enters a
watch with `~`. To avoid this, we can cache the compile map that is
generated during the initial settings evaluation.
The only real drawback I can see is that the compiled map is guaranteed
to remain in memory so long as the BuildStructure instance that holds it
is alive. Given the performance benefit, this seems like a worthwhile
tradeoff.
There have been occasional failures on appveyor where an
AccessDeniedException was thrown at this point. AccessDeniedExceptions
thrown during scripted tests can often by resolved with a Retry.
Reboot is a bit tricky for the remote client because the sbt server is
actually shut down during reboot. When sbt shuts down the client, it can
notify the client that the reason is a reboot. The client can then
connect to the recently introduced boot control socket to display the
reboot output and supply input in case the build fails to load. Once the
server has brought back up the server, the client can reconnect. When
the client session is interactive, we're done once we reconnect. When
it's a batch session, the client needs to resend the remaing commands
that have submitted that it hasn't yet run.
Shutdown was being handled as a special case in CommandExchange. This
promotes it to a full fledged command. Also replace instance of
hard-coded strings with constants.
On windows, it is sometimes possible to leak an sbt process if two
processes are started simultaneously by a remote client at the same
time. When this happens, the second process is unable to create a
server because of the first process and it also has no io streams
because the the client detaches its streams. We can detect this
in the shell command and prevent the process from persisting as a
zombie.
When a remote client sent the command `shutdown` through the shell, the
client would log an error and exit with a nonzero exit code because
before shutting down, the server would notify the client that it was
disconnecting it due to shutdown. In this scenario, we actually do not
want the client to log an error since they initiated the shutdown, so
before doing the full shutdown, we shutdown the client that inititated
the shutdown with the flag that tells the client not to log the shutdown
or return a nonzero exit code.
One issue with the remote client approach is that it is possible for
multiple clients to start multiple servers concurrently. I encountered
this in testing where in one tmux pane I'd start an sbt server and in
another I might run sbtc before the server had finished loading. This
can actually cause java processes to leak because the second process is
unable to start a server but it doesn't necessarily die after the client
that spawned it exits. This commit prevents this scenario by creating a
server socket before it loads the build and closes once the build is
complete. The client can then receive output bytes and forward input to
the booting server.
The socket that is created during boot is always a local socket, either
a UnixDomainServerSocket or a Win32NamedPipeServerSocket. At the moment,
I don't see any reason to support TCP. This socket also has no impact at
all on the normal sbt server that is started up after the project has
loaded.
The socket is hardcoded to be located at the relative path
project/target/$SOCK_NAME or the named pipe $SOCK_NAME where SOCK_NAME
is a farm hash of the absolute path of the project base directory. There
is no portfile json since there is no need since we don't support TCP.
After the socket is created it listens for clients to whom it relays
input to the console's input stream and relays the process output back
to the client. See the javadoc in BootServerSocket.java for further
details.
The process for forking the server is also a bit more complicated after
this change because the client will read the process output and error
streams until the socket is created and thereafter will only read output
from the socket, not the process.
This commit reworks TaskProgress so that it is a singleton object. By
using a singleton, we ensure that there is at most one progress thread
running at a time. With multiple threads, there can be flickering in the
progress reports.
This fixes https://github.com/sbt/sbt/issues/5547. There also was a bug
that the reference to the progress thread was not reset when the thread
itself exited. As a result, it was possible for progress reporting to
stop while tasks were still running. This seemed to primarily happen in
multi-project builds. It should be fixed by this change.
The existing implementation of watch did not work with the thin client.
In sbt 1.3.0, watch was changed to be a blocking command that performed
manual task evaluation. This commit makes the implementation more
similar to < 1.3.0 where watch modifies the state and after running the
user specified command(s), it enters a blocking command. The new
blocking command is very similar to the shell command.
As part of this change, I also reworked some of the internals of watch
so that a number of threads are spawned for reading file and input
events. By using background threads that write to a single event queue,
we are able to block on the file events and terminal input stream rather
than polling. After this change, the cpu utilization as measured by ps
drops from roughly 2% of a cpu to 0.
To integrate with the network client, we introduce a new UITask that is
similar to the AskUserTask but instead of reading lines and adding execs
to the command queue, it reads characters and converts them into watch
commands that we also append to the command queue.
With this new implementation, the watch task that was added in 1.3.0 no
longer works. My guess is that no one was really using it. It wasn't
documented anywhere. The motivation for the task implementation was that
it could be called within another task which would let users define a
task that monitors for file changes before running. Since this had never
been advertised and is only of limited utility anyway, I think it's fine
to break it.
I also had to disable the input-parser and symlinks tests. I'm not 100%
sure why the symlinks test was failing. It would tend to work on my
machine but fail in CI. I gave up on debugging it. The input-parser test
also fails but would be a good candidate to be moved to the client test
in the serverTestProj. At any rate, it was testing a code path that was
only exercised if the user changed the watchInputStream method which is
highly unlikely to have been done in any user builds.
The WatchSpec had become a nuisance and wasn't really preventing from
any regressions so I removed it. The scripted tests are how we test
watch.
The sbtc client can provide a ux very similar to using the sbt shell
when combined with tab completions. In fact, since some shells have a
better tab completion engine than that provided by jilne2, the
experience can be even better. To make this work, we add another entry
point to the thin client that is capable of generating completions for
an input string. It queries sbt for the completions and prints the
result to stdout, where they are consumed by the shell and fed into its
completion engine.
In addition to providing tab completions, if there is no server running
or if the user is completing `runMain`, `testOnly` or `testQuick`, the
thin client will prompt the user to ask if they would like to start an
sbt server or if they would like to compile to generate the main class
or test names. Neither powershell nor zsh support forwarding input to
the tab completion script. Zsh will print output to stderr so we
opportunistically start the server or complete the test class names.
Powershell does not print completion output at all, so we do not start a
server or fill completions in that case*. For fish and bash, we prompt
the user that they can take these actions so that they can avoid the
expensive operation if desired.
* Powershell users can set the environment variable SBTC_AUTO_COMPLETE
if they want to automatically start a server of compile for run and test
names. No output will be displayed so there can be a long latency
between pressing <tab> and seeing completion results if this variable is
set.
This commit adds the ability for sbt to automatically shut itself down
if it has been idle for some duration of time. The motivation is that
if the user may not realize they have an sbt server running in the
background that is using resources. We don't want to be too aggressive
with the idle timeout because that can reduce the efficacy of the thin
client. A value of one week is chosen so that users can enjoy a long
weekend and when they return to their computer, they won't have to
restart sbt. If they haven't used the server in at least a week, it
seems prudent to just kill it.
The sbtipcsocket by default restricts win32 named pipes to only allow
connections from the same login session. This makes connecting to a
remote server not work over ssh. We relax the default slightly in sbt to
allow the owner of the pipe to connect over any logon shell. The user
could restore the old behavior with:
```
Global / windowsServerSecurityLevel := Win32SecurityLevel.LOGON_DACL
```
or, if YOLO
```
Global / windowsServerSecurityLevel := Win32SecurityLevel.NO_SECURITY
```
When we start sbt with the thin client, we want to close the server io
streams after it loads so that the client exiting won't crash the
server. When we are running the server as part of the server tests, it
is nice to have the server output. By setting the --close-io-streams
flag when we launch the server in the client, we are able to achieve
both.
Running multi commands (input commands delimited by semi-colons) did not
work with the thin client. The commands would actually run on the
server, but the thin client would exit immediately without displaying
the output. The reason was that MainLoop would report the exec complete
when all it had done was split the original command into its constituent
parts and prepended them to the state command list. To work around this,
when we detect a network source command, we can remap its exec id to a
different id and only report the original exec id after the commands
complete. We also have to keep track of whether or not the command
succeeded or failed so that the reporting command reports the correct
result.
The way its implemented is with the the following steps:
1. set the terminal to the network terminal
2. stash the current onFailure so that we can properly report failures
3. add the new exec id to a map of the original exec id to the generated
id
4. actually run the command
5. if the command succeeds, add the original exec id to a result map
6. pop the onFailure
7. restore the terminal to console
8. report the result -- if the original exec id is in the result map we
report success. Otherwise we report failure.
There is also logic in NetworkChannel for finding the original exec id
if reporting one of the artificially generated exec ids because the
client will not be aware of that id.
When the user presses ctrl+c, we want to cancel any running tasks that
were initiated by that client. This is a bit tricky because we may not
be sure what is running if the client is in interactive mode. To work
around this, we send a cancellation request with the special id
__CancelAll. When the NetworkChannel receives this request, it cancels
the active task if was initiated by the client that sent the
cancellation request. The result it returns to the client indicates if
there were any tasks to be cancelled. If there were and the client was
in interactive mode, we do not exit. Otherwise we exit.
This commit makes it possible for a remote client to cancel a running
task initiated by a different client by typing `cancel` into the shell.
It can be useful if the remote client has run something blocking like
console.
The console task can't safely be interrupted, so instead we write some
newlines filed by ctrl+d to exit the console.
This commit makes it possible for the sbt server to render the same ui
to multiple clients. The network client ui should look nearly identical
to the console ui except for the log messages about the experimental
client.
The way that it works is that it associates a ui thread with each
terminal. Whenever a command starts or completes, callbacks are invoked
on the various channels to update their ui state. For example, if there
are two clients and one of them runs compile, then the prompt is changed
from AskUser to Running for the terminal that initiated the command
while the other client remains in the AskUser state. Whenever the client
changes uses ui states, the existing thread is terminated if it is
running and a new thread is begun.
The UITask formalizes this process. It is based on the AskUser class
from older versions of sbt. In fact, there is an AskUserTask which is
very similar. It uses jline to read input from the terminal (which could
be a network terminal). When it gets a line, it submits it to the
CommandExchange and exits. Once the next command is run (which may or
may not be the command it submitted), the ui state will be reset.
The debug, info, warn and error commands should work with the multi
client ui. When run, they set the log level globally, not just for the
client that set the level.
This commit adds support for remote clients to connect to the sbt server
and attach themselves as a virtual terminal. In order to make this work,
each connection must send a json rpc request to attach to the server.
When this is received, the server will periodically query the remote
client to get the terminal properties and capabilities that allow the
remote client to act as a jline terminal proxy. There is also support
for json messages with ids sbt/systemIn and sbt/systemOut that allow io
to be relayed from the remote terminal to the sbt server and back.
Certain commands such as `exit` should be evaluated immediately. To make
this work, we add the concept of a MaintenanceTask. The CommandExchange
has a background thread that reads MaintenanceTasks and evaluates them
on demand. This allows maintenance tasks to be evaluated even when sbt
is evaluating an exec. If it weren't done this way, when the user typed
exit while a different remote connection was running a command, they
wouldn't be able to exit until the command completed.
The ServerIntents in ServerHandler did not handle
JsonRpcResponseMessage because prior to this commit, sbt clients were
primarily making requests to the server. But now the server sends
requests to the client for the terminal properties and terminal
capabilities so it was necessary to add an onResponse handler to
ServerIntent.
I had to move the network channel publishBytes method to run on a
background thread because there were scenarios in which the client
socket would get blocked because the server was trying to write on the
same thread that the read the bytes from the client.
To make the console command work, it is necessary to hijack the
classloader for JLine. In MetaBuildLoader, we put a custom forked JLine
that has a setter for the TerminalFactory singleton. This allows us to
change the terminal that is used by JLine in ConsoleReader. Without this
hack, the scala console would not work for remote clients.
In order to support a multi-client sbt server ux, we need to factor
`Terminal` out into a class instead of a singleton. Each terminal provides
and outputstream and inputstream. In all of the places where we were
previously relying on the `Terminal` singleton we need to update the
code to use `Terminal.get`, which will redirect io to the terminal whose
command is currently running.
This commit does not implement the server side ui for network clients.
It is just preparatory work for the multi-client ui.
The Terminal implementations have thread safe access to the output
stream. For this reason, I had to remove the sychronization on the
ConsoleOut lockObject. There were code paths that led to deadlock when
synchronizing on the lockObject.
Rather than going through the console appender logging to make
TaskProgress work, we can instead use the CommandExchange. This will be
useful in future commits where there are multiple terminals that all
need to receive progress. By organizing the TaskProgress this way, we
can store a separate progress state for each terminal and update the
progress for all of the active terminals. We also can set the current
running command in command exchange which will be useful in future
commits to show what command is currently running.
This commit also reworks TaskProgress to always kill its thread when
there are no active tasks. It will start a new thread as soon as there
is another active task.
We had similar code for reading json frames from an input stream in
NetworkChannel and ServerConnection. I reworked and consolidated this
logic into a shared method in ReadJsonFromInputStream.
This commit also removes the ObjectMessage reporting methods that
weren't doing anything.
The collectAnalysis task an be a bit slow and delays client connections
from running commands. This commit adds an option to skip the analysis
if it isn't needed. The default behavior is left as it was.
In Load.scala and Defaults.scala, the AppConfiguration.baseDirectory is
dealiased when it is a symlink. This commit dealiases the
AppConfiguration.baseDirectory if it is a symlink so that sbt
`appConfiguration.value.baseDirectory` should be the same as
`baseDirectory.value`.
Rather than enumerate all of the watch keys that may appear unused
though they can be used by the `~` command, rework lintUnused to take a
function `String => Boolean` instead of `Set[String] => Boolean`.
The AppConfiguration.baseDirectory is dealiased during project loading.
Not dealiasing the symlink here could cause a discrepancy between the
`baseDirectory` key and the value of the base key in the root paths map.
In global bspWorkspace setting, retrieve all projects and all configurations that contain the bspTargetIdentifier setting, so that:
- the IntegrationTest configuration, when added to a project, is automatically associated to a BSP target
- a custom configuration that contains the `Defaults.configSettings` is also associated to a BSP target
Try parse the required semanticdbVersion in the initialization request metadata
Issue a warning if the semanticdb plugin is not enabled
Issue a warning if the semanticdb version is lower than the required
This adds `Def.promise` a facility that wraps `scala.concurrent.Promise`. Project layer, there's an implicit for task-that-returns-promise (`Def.Initialize[Task[PromiseWrap[A]]]`) that would inject `await` method, which returns a task. This is a special task that is tagged with `Tags.Sentinel` so that it will bypass the concurrent restrictions. Since there's no CPU- or IO-bound work, this should be ok.
The purpose of this promise for long-running task to communicate with another task midway.
This adds `pushRemoteCache`, `pushRemoteCacheTo`, `pullRemoteCache`, etc to implement cached compilation facility.
In addition, the analysis file location is now made more clear.
Initial draft for bsp support.
This shows two communication pattern around BSP.
First, if the request can be handled with the build knowledge is readily available in `NetworkChannel` we can reply immediately. `BuildServerImpl#onBspBuildTargets` is an example for that.
Second, if the request requires `State`, then we can forward the parameter into a custom command, and reply back from a command. `BuildServerProtocol.bspBuildTargetSources` is an example of that since it needs to invoke tasks to generate sources.
java.util.ServiceLoader uses findResources(), which was not
overriden in ReverseLookupClassLoader, causing resources available
in the descendant classloader not to be discovered when a service
loader instance was using the top classloader.
Having the progress reports directly generated in beforeWork delays the
task from being submitted to the executor. This commit moves all of the
reporting onto the background thread to avoid these delays since
progress is less important than task evaluation.
The old implementation of checkBuildSources can easily take 20ms to run
when called in MainLoop.processCommand. It is rarely faster than 4-5ms.
To reduce this overhead, I stopped using the checkBuildSources task in
processCommand. Instead, I manually cache the build source hashes in a
global state variable and add a file monitor that invalidate the entire
set of source hashes if any changes are detected. This could probably be
more efficient, but I figure that build sources change infrequently
enough that it's fine to just invalidate the entire list of source
hashes.
Because the CheckBuildSources instance is already watching the meta
build, I reworked Continuous to use that FileTreeRepository for the
build sources if it is available.
Bonus: fixes https://github.com/sbt/sbt/issues/5482
I was surprised to find this method in a flamegraph* so I optimized it.
TaskProgress was actually on the hotpath of task evaluation so every
single task was slower to be enqueued with the CompletionService. After
this change, containsSkipTasks dropped out of the flamegraph.
*The flamegraph was for a compile loop where sbt constantly modified a
single source file and re-compiled it. The containsSkipTasks method
appeared in over 2% of the method calls.
As stated in #5492 and #2366 some artifact hosting services (at least
Azure Artifacts, seems to affect GCP as well) do not offer a stable HTTP
Auth realm that can be safely set in SBT credentials, hence the need to
support null or empty Credentials.
The Ivy (publishing) side of the issue was fixed in sbt/ivy#36
As to the resolving side, This commit is only part of the solution as it
just prevents an NPE and does not consider if coursier itself will fall
back to finding credentials with a null realm that matches the server
hostname.
Related-to: #5492
Related-to: #2366
Signed-off-by: Erwan Queffelec <erwan.queffelec@gmail.com>
In order to make supershell work with println, this commit introduces a
virtual System.out to sbt. While sbt is running, we override the default
java.lang.System.out, java.lang.System.in, scala.Console.out and
scala.Console.in (unless the property `sbt.io.virtual` is set to
something other than true). When using virtual io, we buffer all of the
bytes that are written to System.out and Console.out until flush is
called. When flushing the output, we check if there are any progress
lines. If so, we interleave them with the new lines to print.
The flushing happens on a background thread so it should hopefully not
impede task progress.
This commit also adds logic for handling progress when the cursor is not
all the way to the left. We now track all of the bytes that have been
written since the last new line. Supershell will then calculate the
cursor position from those bytes* and move the cursor back to the
correct position. The motivation for this was to make the run command
work with supershell even when multiple main classes were specified.
* This might not be completely reliable if the string contains ansi
cursor movement characters.
Presently if a server command comes in while in the shell, the client
output can appear on the same line as the command prompt and the command
prompt will not appear again until the user hits enter. This is a
confusing ux. For example, if I start an sbt server and type
the partial command "comp" and then start up a client and run the clean
command followed by a compile, the output looks like:
[info] sbt server started at local:///Users/ethanatkins/.sbt/1.0/server/51cfad3281b3a8a1820a/sock
sbt:scala-compile> comp[info] new client connected: network-1
[success] Total time: 0 s, completed Dec 12, 2019, 7:23:24 PM
[success] Total time: 0 s, completed Dec 12, 2019, 7:23:27 PM
[success] Total time: 2 s, completed Dec 12, 2019, 7:23:31 PM
Now, if I type "ile\n", I get:
[info] sbt server started at local:///Users/ethanatkins/.sbt/1.0/server/51cfad3281b3a8a1820a/sock
ile
[success] Total time: 0 s, completed Dec 12, 2019, 7:23:34 PM
sbt:scala-compile>
Following the same set of inputs after this change, I get:
[info] sbt server started at local:///Users/ethanatkins/.sbt/1.0/server/51cfad3281b3a8a1820a/sock
sbt:scala-compile> comp
[info] new client connected: network-1
[success] Total time: 0 s, completed Dec 12, 2019, 7:25:58 PM
sbt:scala-compile> comp
[success] Total time: 0 s, completed Dec 12, 2019, 7:26:14 PM
sbt:scala-compile> comp
[success] Total time: 1 s, completed Dec 12, 2019, 7:26:17 PM
sbt:scala-compile> compile
[success] Total time: 0 s, completed Dec 12, 2019, 7:26:19 PM
sbt:scala-compile>
To implement this change, I added the redraw() method to LineReader
which is a wrapper around ConsoleReader.drawLine; ConsoleReader.flush().
We invoke LineReader.redraw whenever the ConsoleChannel receives a
ConsolePromptEvent and there is a running thread.
To prevent log lines from being appended to the prompt line, in the
CommandExchange we print a newline character whenever a new command is
received from the network or a network client connects and we believe
that there is an active prompt.
Prior to this change, if a network command came in, it would run in the
background with no real feedback in the server ui. Prior to this change,
running compile from the thin client would look like:
sbt:scala-compile>
[success] Total time: 1 s, completed Dec 12, 2019, 7:24:43 PM
sbt:scala-compile>
Now it looks like:
sbt:scala-compile>
[info] Running remote command: compile
[success] Total time: 1 s, completed Dec 12, 2019, 7:26:17 PM
sbt:scala-compile>
It's a bit annoying to have to hit enter here.
Also, this should fix https://github.com/sbt/sbt/issues/5162 because if
there is no System.in attached, the read will return -1 which will cause
sbt to quit.
There typically are fewer than 10 main classes in a project so allow the
user to just input a single digit in those cases. Otherwise fallback to
a line reader.
When the user inputs `run` or `runMain` we shouldn't print the warning
about multiple classes because in the case of run they already will be
prompted to select the classes and in the case of runMain, they are
already required to specify the class name.
Bonus:
* improve punctuation
* add clear screen to selector dialogue
* print selector dialogue in one call to println -- this should prevent
the possibility of messages from other threads being interlaced with
the dialogue
This commit aims to centralize all of the terminal interactions
throughout sbt. It also seeks to hide the jline implementation details
and only expose the apis that sbt needs for interacting with the
terminal.
In general, we should be able to assume that the terminal is in
canonical (line buffered) mode with echo enabled. To switch to raw mode
or to enable/disable echo, there are apis: Terminal.withRawSystemIn and
Terminal.withEcho that take a thunk as parameter to ensure that the
terminal is reset back to the canonical mode afterwards.
When trying to use any jdk > 8 with the latest sbt, sbt will die in some
projects because it tries to call Locate.defineClass on rt.jar, which
is represented with a DummyVirtualFile and causes a crash.
Fixes https://github.com/sbt/sbt/issues/3112
This unpacks Extracted as State's extension methods.
In addition this provides a way of responding via LSP.
csrResolvers.all evaluates all possible scopes in arbitrary order. This change
makes sure at least the project resolvers are placed before any resolvers from
dependency projects.
When sbt was entered through xMain.run and the classloaders do not have
the expected format, sbt recreates the classloaders for itself.
Unfortunately the extra classpath was not added to the classloader. This
caused project/extra to fail if it was entered from RunFromSourceMain
rather than with the launcher.
The main reason for having both the RunFromSourceMain and LauncherBased
scripted tests was that RunFromSourceMain would fail for any test that
ended up accessing the sbt.Package$ object. This commit fixes this bug
by reworking the classloader generated by RunFromSourceMain to invoke
sbt, switching from the classpath to jar classpath (by setting exportJars =
true) and entering sbt by calling `new xMain().run` rather than
`xMain.run`.
The reason for switching to the jar classpath is that the jvm seems to
have issues when there are two classes provided in different directories
that have the same case insensitive name, e.g. `sbt.package$` and
`sbt.Package$`. If those classes are instead provided in different
jars, the jvm seems to be able to handle it.
Exporting the jars is not enough though, I had to rework the
ClassLoader created in the launch method to have a layout that was
recognized by xMainConfiguration. I reimplemented the AppConfiguration
in java so that it could bootstrap itself in a single jar classloader
(the only needed jar is the Scripted.
If we export the jars in the build, then the NoClassDefErrors for
`sbt.Package$` go away during scripted tests using RunSourceFromMain.
This might make running tests in subprojects slightly slower but I think
its a worthy tradeoff.
In order for the sbt launcher to be able to resolve a local version of
sbt, we must publish the main jar, the sources jar, the doc jar, the pom
and an ivy.xml file. The publish and publishLocal tasks are wired in
IvyXml.scala to create an ivy.xml file before running publish. This
wasn't done with publishLocalBin which made it not work when no ivy.xml
file was already present (which was the case after running clean).
either create test reports with legacy file names (legacyreport=true) or with standard file names (legacyreport=false or omitted) but not both as suggested in #4451
Fixes https://github.com/sbt/sbt/issues/5339
It seems like some tests are using `ClassLoader#getResource("")` to acquire the `classes` directory path. This does not seem to work on sbt 1.3.6, which returns `file:/home/travis/.cache/coursier/v1/https/repo1.maven.org/maven2/org/apache/logging/log4j/log4j-api/2.11.2/log4j-api-2.11.2.jar!/META-INF/versions/9/`. To workaround this issue, I've switched to loading the known folder name instead.
In 53788ba356, I changed the cross multi
parser to issue all of the commands sequentially. This caused a
performance regression for many use cases:
https://github.com/sbt/sbt/issues/5321. This commit restores the old
behavior of `+` if the command to run has no arguments.
Ref https://github.com/sbt/sbt/issues/5314
Ref https://github.com/sbt/sbt/pull/5265
In sbt 1.3.4, it's possible to define a subproject named `client`.
The current parser behaves differently whether we calll `client/clean` or `client / clean` with whitespaces. The one with the whitespace invokes `client` command (as in thin client). This gets triggered by `+clean` because the new implementation uses whitespace.
There have been a number of issues that have come up because of sbt
1.3.0 aggressively closing classloaders. While these issues have been
quite useful in helping us determine some issues related to classloader
lifecycle, we should give users the option to prevent sbt from closing
the classloaders.
I also noticed that the classloader-cache/spark test has been
occasionally segfaulting on travis so I disable classloader closing in
that test.
Fixes https://github.com/sbt/sbt/issues/5063
This fixes "sbt new" on Ubuntu by restoring the terminal state after supershell querying for the terminal width.
sbt should not by default create files in the location specified by
java.io.tmpdir (which is the default behavior of apis like
IO.createTemporaryDirectory or Files.createTempFile) because they have a
tendency to leak and it also isn't even guaranteed that the user has
write permissions there (though this is unlikely). Doing so creates the
possibility for leaks
I git grepped for `createTemp` and found these apis. After this change,
the files created by sbt should largely be localized to the project and
sbt global base directories.
Closing the ManagedClassLoader generated by test can cause nonlocal
effects because the jdk shares some JarFile resources across multiple
URLClassLoaders. As a result, if one classloader is trying to load a
resource and the classloader is closed, it might cause the resource
loading to fail (see https://github.com/sbt/sbt/issues/5262). This can
be fixed by moving the scalatest framework jar (and its dependencies)
into an additional classloader layer that sits between the scala library
loader and the rest of the test dependencies.
In addition to adding the new layer, I reworked the
ReverseLookupClassLoader to use its dependent classloader to find
resources that may below it in the classloading hierarchy rather than
constructing an entirely new classloader for resources.
After this change, I was able to run test in the repro project:
https://github.com/rjmac/sbt-5262 1000 times with no failures. Note that
the repro is sensitive to the jdk used. I could not reproduce with jdk11
but I could typically induce a failure within 20 or so runs with jdk8.
I benchmarked this change with
https://github.com/eatkins/scala-build-watch-performance and performance
was roughly the same as 1.3.4 with turbo mode and about 200-250ms faster
in non-turbo mode (which can be explained by the time to load the
scalatest classes).
This makes it possible to do mkIvyConfiguration.value.withXXX(...) for
all the methods in InlineIvyConfiguration. (I need this to remove
inter-project resolvers when fetching dotty from sbt-dotty to avoid
accidentally fetching a local project in the build of dotty itself).
The previous implementation of ZombieClassLoader was not thread safe.
This caused problems because it is possible for the ManagedClassLoader
in test to leak into the coursier thread pool if the test uses bouncy
castle apis. Unfortunately, these apis seem to in some cases assign
static variables using the Thread context class loader. Because the
bouncycastle apis are implemented by the jdk, they are found in the
system classloader and thus the static references leak out of the test
context.
I had a local repro of https://github.com/sbt/sbt/issues/5249 that is
fixed by this change.
I found it hard to reason about where certain local variables, like
currentRef, were coming from. I also changed 'x' to 'extracted' in a few
places for clarity as well.
Rather than putting the background job temporary files in whatever
java.io.tmpdir points to, this commit moves the files into a
subdirectory of target in the project root directory.
To make the directory configurable via settings, I had to move the
declaration of the bgJobService setting later in the project
initialization process. I don't think this should matter because
background jobs shouldn't be created until after the project has loaded
all of its settings..
When a user calls sbt exit and there is an active background job, sbt
may not exit cleanly. This was primarily because the
background job service shutdown method depended on the
StandardMain.executionContext which was closed before the background job
service was shutdown. This was fixable by reordering the resource
closing in StandardMain.runManaged.
When running a main method, if the user inputs ctrl+c then the `run`
task will exit but the main method is not interrupted so it continues
running even once sbt has returned to the shell. If the main method is a
webserver, this prevents run from ever starting again on a fixed port.
To fix this, we can modify the waitForTry method to stop the job if an
exception is thrown (ctrl+c leads to an interrupted exception being
thrown by waitFor).
I rework the BackgroundJobService so that the default implementation of
waitForTry is now usable and no longer needs to be overridden. The side
effect of this change is that waitFor may now throw an exception. Within
sbt, waitFor was only used in one place and I reworked it to use
waitForTry instead. This could theoretically break a downstream user who
relied on waitFor not throwing an exception but I suspect that there
aren't many users of this api, if any at all.
The StateTransform class introduced in
9cdeb7120e did not cleanly integrate with
logic for transforming the state using the `transformState` task
attribute. For one thing, the state transform was only applied if the
root node returned StateTransform but no transformation would occur if a
task had a dependency that returned StateTransform. Moreover, the
transformation would override all other transformations that may have
occurred during task evaluation.
This commit updates StateTransform to act very similarly to the
transformState attribute. Instead of wrapping a `State` instance, it now
wraps a transformation function from `State => State`. This function
can be applied on top of or prior to the other transformations via the
`transformState`.
For binary compatibility with 1.3.0, I had to add the stateProxy
function as a constructor parameter in order to implement the `state`
method. The proxy function will generally throw an exception unless the
legacy api is used. To avoid that, I private[sbt]'d the legacy api so
that binary compatibility is preserved but any builds targeting > 1.4.x
will be required to use the new api.
Unfortunately I couldn't private[sbt] StateTransform.apply(state: State)
because mima interpreted it as a method type change becuase I added
StateTransform.apply(transform: State => State). This may be a mima bug
given that StateTransform.apply(state: State) would be jvm public even
when private[sbt], but I figured it was quite unlikely that any users
were using this method anyway since it was incorrectly implemented in
1.3.0 to return `state` instead of `new StateTransform(state)`.
The linting can take a while for large projects because `Def.compiled`
scales in the number of settings. Even for small projects (i.e. scripted
tests), it takes about 50 ms on my computer. This doesn't change the
current behavior because the default value is true.
This PR includes the values of the `description` and `homepage`
settings into the `ivy.xml` files generated by the `makeIvyXml`
task. It restores the behaviour of sbt 1.2.8 and if `useCoursier`
is set to `false`.
Two things are changed in this PR:
* `IvyXml.content` now adds the `homepage` attribute to the
`description` element if `project.info.homePage` is not empty.
* `CoursierInputsTasks.coursierProject0` now fills the previous
empty `CProject.info` field with the description and homepage.
Closes: #5234
Ref #4211Fixes#4395Fixes#4600
This is a reimplementation of `--addPluginSbtFile`. #4211 implemented the command to load extra `*.sbt` files as part of the global plugin subproject. That had the unwanted side effects of not working when `.sbt/1.0/plugins` directory does not exist. This changes the strategy to load the `*.sbt` files as part of the meta build.
```
$ sbt -Dsbt.global.base=/tmp/hello/global --addPluginSbtFile=/tmp/plugins/plugin.sbt
[info] Loading settings for project hello-build from plugin.sbt ...
[info] Loading project definition from /private/tmp/hello/project
sbt:hello> plugins
In file:/private/tmp/hello/
sbt.plugins.IvyPlugin: enabled in root
sbt.plugins.JvmPlugin: enabled in root
sbt.plugins.CorePlugin: enabled in root
sbt.ScriptedPlugin
sbt.plugins.SbtPlugin
sbt.plugins.SemanticdbPlugin: enabled in root
sbt.plugins.JUnitXmlReportPlugin: enabled in root
sbt.plugins.Giter8TemplatePlugin: enabled in root
sbtvimquit.VimquitPlugin: enabled in root
```
The current injection of the new nio keys will overwrite any definitions
of those keys in a build source. This is undesirable. The fix is to
create a mapping of scoped keys to settings and for each inject setting
key, if there is a previous key, put that definition after the injected
definition so that it can override it.
It is still possible for progress threads to leak so shut them down if
there are no active tasks. The report0 method will start up a new thread
if a task is added.
In some circumstances, sbt would generate a number of task progress
threads that could run concurrently. The issue was that the TaskProgress
could be shared by multiple EvaluateTaskConfigs if a dynamic task was
used. This was problematic because when a dynamic task completed, it
might call afterAllCompleted which would stop the progress thread. There
also was a race condition because multiple threads calling initial could
theoretically have created a new progress thread which would cause a
resource leak.
To fix this, we modify the shared task progress so that the `stop()`
method is a no-op. This should prevent dynamic tasks from stopping the
progress thread. We also defer the creation of the task thread until
there is at least one active task. This prevents a thread from being
created in the shell.
The motivation for this change was that I found that sometimes there was
a leaked progress thread that would make the shell not really work for
me because the progress thread would overwrite the shell prompt. This
change fixes that behavior and I was able to validate with jstack that
there was consistently either one or zero task progress threads at a
time (zero in the shell, one when tasks were actually running).
When running sbt -Dtask.timings=true, the task timings get printed to
the console which can overwrite the shell prompt. When we use a logger,
the timing lines are correctly separated from the prompt lines.
The completions were generating page numbers that didn't make sense if
there were a small number of scripted tests. For example, suppose that
there were only two tests defined, it would generate *1of3 *2of3 and
*3of3 completions even though there weren't even three tests.
The way clean was implemented, it was running `clean`, `ivyModule` and
`streams` concurrently. This was problematic because clean could blow
away files needed by `ivyModule` and `streams`. To fix this, move the
cleanCachedResolutionCache into a separate task and run that before the
normal clean.
Should fix https://github.com/sbt/sbt/issues/5067.
Fixes https://github.com/sbt/sbt/issues/3183
This implements an input task lintBuild that checks for unused settings/tasks.
Because most settings are on the intermediary to other settings/tasks, they are included into the linting by default. The notable exceptions are settings used exclusively by a command. To opt-out, you can either append it to `Global / excludeLintKeys` or set the rank to invisible.
On the other hand, many tasks are on the leaf (called by human), so task keys are excluded from linting by default. However there are notable tasks that trip up users, so they are opted-in using `Global / includeLintKeys`.
I noticed that when entering the console, I'd often be left with a
supershell line at the bottom of the screen that would eventually get
interlaced with my console commands. This can be eliminated by clearing
the supershell progress before evaluating the task if it is one of the
skip tasks.
Fixes https://github.com/sbt/sbt/issues/1673
There's been report of intermittent "Could not create directory" error related to "classes.bak." retronym identified that all configurations are using the same directory, and that might be the cause of race condition.
This addresses the issue by assigning a unique directory for each configuration.
These classloaders which are created if sbt is launched with a legacy
launcher (or one that doesn't follow the current classloading hierarchy
convention), were implemented in scala, but that meant that they were
not parallel capable. I fix that by moving the implementations to java.
I also move the static method that creates a MetaBuildLoader into the
java class.
A number of users were reporting issues with deadlocking when using
1.3.2: https://github.com/sbt/sbt/issues/5116. This seems to be because
most of the sbt created classloaders were not actually parallel capable.
In order for a classloader to be registered as a parallel capable, ALL
of the parent classes except for object in the class hierarchy must be
registered as a parallel capable:
https://docs.oracle.com/javase/8/docs/api/java/lang/ClassLoader.html#registerAsParallelCapable--.
If a classloader is not registered as parallel capable, then a global
lock will be used internally for classloading and this can lead to deadlock.
It is impossible to register a scala 2 classloader as parallel capable
so I ported all of the classloaders to java.
This commit updates the java-serialization scripted test. Prior to the
port, the new version of the test would more or less always deadlock.
After this change, I haven't been able to reproduce a deadlock.
This had no significant performance impact when I reran
https://github.com/eatkins/scala-build-watch-performance
Fixes#1458
Running sbt from `/` results to sbt getting stuck trying to load the directories recursively, and eventually erroring with a java.lang.OutOfMemoryError (after freezing for a long time) even on an Alpine container.
To prevent it, this adds a check to see if the absolute path is `/` or not.
```
/ $ sbt -Dsbt.version=1.4.0-SNAPSHOT
[error] java.lang.IllegalStateException: cannot run sbt from root directory without -Dsbt.rootdir=true; see sbt/sbt#1458
[error] Use 'last' for the full log.
```
The message:
```
Project loading failed: (r)etry, (q)uit, (l)ast, or (i)gnore?
```
is not explicit about retry being the option used when pressing return.
I don't think that dummy tasks really make sense for task progress
because they are evaluated outside of the normal task evaluation. This
came up because I was seeing streams-manager in supershell which didn't
seem useful.
I noticed some flickering in super shell progress lines and realized
that it was because there were multiple progress threads running
concurrently. This is problematic because each thread has a completely
different state so if each thread has an active task, the display will
flicker between the two tasks. I think this is caused primarily by
dynamic tasks. At least the example where I was seeing it was caused by
a dynamic task.
If a project had a meta-meta build (project/project), the build sources
in the project directory were ignored. This was because the projectGlobs
method did not correctly handle recursion. It inadvertently
discarded the accumulator globs and only returned the most recently
generated globs. This commit fixes that and adds a regression test to
the nio/reload scripted test.
I was looking into sbt start up time and in profiling was able to
identify a number of classloading bottlenecks. To speed up
initialization, we can preload those classes in the background. I saw
average speedups of roughly .75 seconds after this change. Also, the `time`
command would consistently report cpu system time very close to 400% and
I have 4 cores on my laptop. With 1.3.0 it would be more like 350%.
There have been a number of complaints about the new classloader closing
behavior. It is too aggressive about closing classloaders after test and
run. This commit softens the behavior by allowing a classloader to be
resurrected after close by creating a new zombie classloader that has
the same urls as the original classloader. After this commit, we always
close the classloaders when we are done, but they can still leak
file descriptors if a zombie is created.
To configure the behavior, I add the allowZombieClassLoaders key. If it
is false (which is default), we will warn but still allow them. If it
is true, then we silence the warning. In a later version of sbt, we can
change the semantics to be strict.
I verified after this change that I could add a shutdown hook in `run`
and it would be evaluated so long as I set `bgCopyClasspath := false`.
Otherwise the needed jars were deleted before the hooks could run.
Bonus: delete unused ResourceLoaderImpl class
Fixes#5070
This adds a new setting called `includePluginResolvers` (default `false`).
When set to `true`, it the project will include resolvers from the metabuild.
This allows the build user to declare a resolver in one place (`project/plugins.sbt`) that gets applied to both the metabuild as well as all the subprojects. The scenario comes up when someone distributes a software on their own repo. Ref #4103
We want to recursively monitor the project meta build, but we also want
to avoid listing directories that don't exist. To compromise, I rework
the buildSourceFileInputs to add the nested project directories if they
exist. Because the fileInputs are a setting, this means that adding a
new project directory and *.sbt or *.scala will not immediately trigger
a rebuild, but in most common cases, it will. I added a scripted test
for this.
In sbt 1.3.0, we only monitor build sources in the root project
directory and the root project meta build directory. This commit adds
these inputs for each project.
Fixes https://github.com/sbt/sbt/issues/5061.
This was an oversight that caused consoleQuick to not work with
supershell. We should probably try to figure out a way to allow custom
tasks to black list themselves from super shell reporting.
In https://github.com/sbt/sbt/issues/5075 we realized that sbt 1.3.0
introduces a regression where it closes the classloader used to invoke
the main method for in process run before all of its non-daemon threads
have terminated. To fix this and still close the classloader, I add a
method, runInBackgroundWithLoader that provides the background job
services with an optional classloader that it can close after the job
completes.
This cleanly merges and works with 1.3.x as well.
Fixes https://github.com/sbt/sbt/issues/5047
When setting swoval.tmpdir via globalBase, changed to set globalBase as absolute path.
`com.swoval.runtime.NativeLoader.loadPackaged` uses `java.lang.System.load`.
It requires absolute path, so we should set `swoval.tmpdir` with absolute path.
During refactoring, these warnings got out of date. I also added
scaladoc to the watchTriggeredMessage key.
Ref: https://github.com/sbt/sbt/issues/5051.
To avoid reliance on jvm global variables, we need to share the super
shell state with each of the console appenders that write to the console
out. We only set the progress state for the console appenders for the
screen. This prevents messages that are below the global logging level
from modifying the progress state without preventing them from being
written to other appenders.
The ability to set the ProgressState for each of the console appenders
is added in a companion util PR.
I verified that the test output of io/test was correctly written to the
streams after this change (there were no progress lines in the output).
Disabling supershell when color mode is disabled is a sensible default
(especially for piped output). However, I think it should still be
possible to use supershell in no color mode.
This requires a util change that also enables supershell in no color
mode.
It is redundant and slow to restamp all of the dependency classpath
files when they have likely already been stamped by a subproject.
For the classfiles of subprojects, we fill the managedFileStampCache
with the values returned by the zinc compile analysis product stamps.
This is why they are probably already in the managed cache and should be
up to date so long as zinc is working correctly.
I noticed that various outputFileStamps tasks were showing up in the
task timing report when I ran Test / definedTests in the main sbt project.
That task became about 400ms faster after this change.
I noticed that the reports generated when using sbt.task.timings=true
made very little sense. They were displaying timings for tests that
couldn't possibly have been run. I tracked this down to the TaskTimings
be stored in the progressReport setting which meant they were reused
across multiple task runs. After this change, the reports made a lot
more sense.
The tab completions for scripted have long been broken. They display a
number of non-sensical pages like '*0of9' or '*1of0'. Some of the
multiparser changes seem to have caused these invalid
In 5eab9df0df, I updated the
outputFileStamps task to compute all of the stamps for a directory
recursively if an output file is a directory. Prior to that, it had only
computed the stamp for the directory itself. This caused a significant
performance regression in creating the test classloader because it was
computing the last modified time for all of the classfiles in the class path.
The test for 5000 source files in
https://github.com/eatkins/scala-build-watch-performance was running roughly
400ms slower due to this regression.
Ref https://github.com/sbt/sbt/issues/4905
This is a companion PR to https://github.com/sbt/librarymanagement/pull/318.
This will print the following warnings:
```
sbt:hello> compile
[warn] insecure HTTP request is deprecated 'Artifact(jsoup, jar, jar, None, Vector(), Some(http://jsoup.org/packages/jsoup-1.9.1.jar), Map(), None, false)'; switch to HTTPS or opt-in using from(url(...), allowInsecureProtocol = true) on ModuleID or .withAllowInsecureProtocol(true) on Artifact
[warn] insecure HTTP request is deprecated 'http://repo.typesafe.com/typesafe/releases/'; switch to HTTPS or opt-in as ("Typesafe Releases" at "http://repo.typesafe.com/typesafe/releases/").withAllowInsecureProtocol(true)
[warn] insecure HTTP request is deprecated 'http://repo.typesafe.com/typesafe/releases/'; switch to HTTPS or opt-in as ("Typesafe Releases" at "http://repo.typesafe.com/typesafe/releases/").withAllowInsecureProtocol(true)
[warn] insecure HTTP request is deprecated 'http://repo.typesafe.com/typesafe/releases/'; switch to HTTPS or opt-in as ("Typesafe Releases" at "http://repo.typesafe.com/typesafe/releases/").withAllowInsecureProtocol(true)
[warn] insecure HTTP request is deprecated 'Patterns(ivyPatterns=Vector(), artifactPatterns=Vector(http://repo.typesafe.com/typesafe/releases/[organisation]/[module](_[scalaVersion])(_[sbtVersion])/[revision]/[artifact]-[revision](-[classifier]).[ext]), isMavenCompatible=true, descriptorOptional=false, skipConsistencyCheck=false)'; switch to HTTPS or opt-in as Resolver.url("Typesafe Ivy Releases", url(...)).withAllowInsecureProtocol(true)
[warn] insecure HTTP request is deprecated 'Patterns(ivyPatterns=Vector(), artifactPatterns=Vector(http://repo.typesafe.com/typesafe/releases/[organisation]/[module](_[scalaVersion])(_[sbtVersion])/[revision]/[artifact]-[revision](-[classifier]).[ext]), isMavenCompatible=true, descriptorOptional=false, skipConsistencyCheck=false)'; switch to HTTPS or opt-in as Resolver.url("Typesafe Ivy Releases", url(...)).withAllowInsecureProtocol(true)
[warn] insecure HTTP request is deprecated 'Patterns(ivyPatterns=Vector(), artifactPatterns=Vector(http://repo.typesafe.com/typesafe/releases/[organisation]/[module](_[scalaVersion])(_[sbtVersion])/[revision]/[artifact]-[revision](-[classifier]).[ext]), isMavenCompatible=true, descriptorOptional=false, skipConsistencyCheck=false)'; switch to HTTPS or opt-in as Resolver.url("Typesafe Ivy Releases", url(...)).withAllowInsecureProtocol(true)
```
I inadvertently changed the semantics of clean so that cleanFiles would
only delete the file if it was a regular file. In older versions of sbt,
if a file in cleanFiles was a directory, it would be recursively
deleted.
During refactoring of Continuous, I inadvertently changed the semantics
of `~` so that all multi commands were run regardless of whether or not
an earlier command had failed. I fixed the issue and added a regression
test.
It was reported in https://github.com/sbt/sbt/issues/4973 that the
scalaVersion setting was not being correctly set in a script running
with ScriptMain using 1.3.0-RC4. Using git bisect, I found that the
issue was introduced in
73cfd7c8bd.
That commit manipulates the classloaders passed in by the launcher, but
only for the xMain entry point. I found that the script ran correctly if
I updated the classloader for ScriptedMain as well.
After these changes, the example script in #4973 correctly prints 2.13.0
for the scala version with a locally published sbt.
Bonus: rename xMainImpl object xMain. It was private[sbt] anyway.
https://github.com/sbt/sbt/issues/4986 reported that +compile would
always recompile everything in the project even when the sources hadn't
changed. This was because the dependency classpath was changing between
calls to compile, which caused the external hooks cache introduced in
32a6d0d5d7 to invalidate the scala
library. To fix this, I cache the file stamps on a per scala version
basis. I added a scripted test that checks that there is no
recompilation in two consecutive calls to `+compile` in a multi scala
version build. It failed prior to these changes.
I looked for serial bottlenecks in sbt project loading and discovered
that KeyIndex.aggregate was relatively easily parallelizable. Before
this time, it took about 1 second to run KeyIndex.aggregate in the akka
project on my computer. After this change, it took 250ms. Given that I
have 4 logical cores, the speedup is roughly linear.
It turns out that injecting the keys necessary for incremental tasks
causes a significant startup penalty for many larger projects. For
example, akka starts up about 3 seconds faster if do not inject these
settings for the tasks returning `File` or `Seq[File]`. Given that all
of these apis use java.nio.file anyway, it makes sense to not backport
them to older tasks.
The clean task got a lot slower in 1.3.0
(https://github.com/sbt/sbt/issues/4972). The reason for this was that
sbt 1.3.0 generated many custom clean tasks for any tasks that returned
`File` or `Seq[File]`. Each of these tasks was tagged with
Tags.Clean which meant that only one of them could run at a time. As a
result, it took a long time to evaluate all of the custom tasks, even if
they were no-ops. In the akka project, a no-op clean was taking 35
seconds which is simply unacceptable. After this change, a no-op clean
takes less than a second in akka (a full clean only takes about 6
seconds after running test:compile)
To fix this, I stopped aggregating the clean task across configs and
projects. Because I removed the aggregation, I needed to manually
implement clean in the `Compile` and `Test` configurations to make
`Compile / clean` and `Test / compile` clean work correctly.
Fixes#4964
Together with https://github.com/sbt/util/pull/211, this brings back stack trace supression for custom tasks by default.
Debug levels logs are available in `last`, and this prints a message informing the user of the fact. BLUE on dark background is difficult to read, so I am chaning the color hilight to MAGENTA.
After adding the automatic lookup to external hooks for missing binary
jars, the scripted test dependency-management/invalidate-internal
started failing. This was because the previous analysis contained a jar
dependency that still existed on disk but was no longer a part of the
dependency classpath. Fundamentally the problem is that the zinc
compile analysis is not tightly coupled with the sbt build state.
To fix this, we can cache the dependency classpath file stamps in the
same way that we cache the input file stamps in external hooks and
manually diff them at the sbt level. We then force updates regardless of
the difference between the zinc state and the sbt state.
It was reported that in community builds, sometimes there was
spurious over-compilation due to invalidation of the scala library jar
(https://github.com/sbt/sbt/issues/4948). The reason for this was that
the external hooks prefills the managed cache with all of the time
stamps for the project dependencies but was not looking up any jars that
weren't in the cache. I suspect I did this because I didn't realize that
zinc also includes its own classpath in the binaries which is not
a part of the dependencyClasspath. The fix is to just add the jar to the
cache if it doesn't already exist by switching to getOrElseUpdate from
get.
I followed the steps in #4948 and published a version of sbt locally
with this change and the spurious re-builds stopped.
In the code formatting use case, the formatting task may modify the
source files in place. If the formatting task uses the nio
inputFileStamps, then it would fill the in-memory cache of source paths
to file stamps. This would cause compile to see the pre-formatted
stamps. To fix this, we can invalidate the file cache entries for the
outputs of a task. This will cause the side-effect of some extra io
because the hashes may be computed three times: once for the format
inputs, once for the format outputs and once for the compile inputs. I
think most users would understand that adding auto-formatting would
potentially slowdown compilation.
To really prove this out, I implemented a poor man's scalafmt plugin in
a scripted test. It is fully incremental. Even in the case when some
files cannot be formatted it will update all of the files that can be
formatted and not re-format them until they change.
It makes sense to add a scope for the `fileTreeView` key where it is
available. At the moment, there is only one `fileTreeView`
implementation but, if that changes down the road, these tasks will
automatically inherit the correct view.
It makes sense for the new glob/nio based apis that we provide first
class support for filtering the results. Because it isn't possible to
scope a task within a task within a task, i.e.
`compile / fileInputs / includePathFilter`, I had to add four new
filter settings of type `PathFilter`:
fileInputIncludeFilter :== AllPassFilter.toNio,
fileInputExcludeFilter :== DirectoryFilter.toNio || HiddenFileFilter,
fileOutputIncludeFilter :== AllPassFilter.toNio,
fileOutputExcludeFilter :== NothingFilter.toNio,
Before I was effectively hard-coding the filter: RegularFileFilter &&
!HiddenFileFilter in the inputFileStamps and allInputFiles tasks. These
remain the defaults, as seen in the fileInputExcludeFilter definition
above, but can be overridden by the user.
It makes sense to exclude directories and hidden files for the input
files, but it doesn't necessarily make sense to apply any output filters
by default. For symmetry, it makes sense to have them, but they are
unlikely to be used often.
Apart from adding and defining the default values for these keys, the
only other changes I had to make was to remove the hard-coded filters
from the allInputFiles and inputFileStamps tasks and also add the
filtering to the allOutputFiles task. Because we don't automatically
calculate the FileAttributes for the output files, I added logic for
bypassing the path filter application if the PathFilter is effectively
AllPass, which is the case for the default values because:
AllPassFilter.toNio == AllPass
NothingFilter.toNio == NoPass
AllPass && !NoPass == AllPass && AllPass == AllPass
Prior to this commit, change tracking in sbt 1.3.0 was done via the
changed(Input|Output)Files tasks which were tasks returning
Option[ChangedFiles]. The ChangedFiles case class was defined in io as
case class ChangedFiles(created: Seq[Path], deleted: Seq[Path], updated: Seq[Path])
When no changes were found, or if there were no previous stamps, the
changed(Input|Output)Files tasks returned None. This made it impossible
to tell whether nothing had changed or if it was the first time.
Moreover, the api was awkward as it required pattern matching or folding
the result into a default value.
To address these limitations, I introduce the FileChanges class. It can
be generated regardless of whether or not previous file stamps were
available. The changes contains all of the created, deleted, modified
and unmodified files so that the user can directly call these methods
without having to pattern match.
It is tedious to write (foo / allInputFiles).value so I added simple
extension method macros that expand `foo.inputFiles` to
(foo / allInputFiles).value and `foo.outputFiles` to
`(foo / allOutputFiles).value`.
While writing documentation for the new file management/incremental
task evaluation features, I realized that incremental task evaluation
did not have the correct semantics. The problem was that calls to
`.previous` are not scoped within the current task. By this, I mean that
say there are tasks foo and bar and that the defintion of bar looks like
bar := {
val current = foo.value
foo.previous match {
case Some(v) if v == current => // value hasn't changed
case _ => process(current)
}
}
The problem is that foo.previous is stored in
effectively (foo / streams).value.cacheDirectory / "previous". This
means that it is completely decoupled from foo. Now, suppose that the
user runs something like:
> set foo := 1
> bar // processes the value 1
> set foo := 2
> foo
> bar // does not process the new value 2 because foo was called, which updates the previous value
This is not an unrealistic scenario and is, in fact, common if the
incremental task evaluation is changed across multiple processing steps.
For example, in the make-clone scripted test, the linkLib task processes
the outputs of the compileLib task. If compileLib is invoked separately
from linkLib, then when we next call linkLib, it might not do anything
even if there was recompilation of objects because the objects hadn't
changed since the last time we called compileLib.
To fix this, I generalizedthe previous cache so that it can be keyed on
two tasks, one is the task whose value is being stored (foo in the
example above) and the other is the task in which the stored task value
is retrieved (bar in the example above). When the two tasks are the
same, the behavior is the same as before.
Currently the previous value for foo might be stored somewhere like:
base_directory/target/streams/_global/_global/foo/previous
Now, if foo is stored with respect to bar, it might be stored in
base_directory/target/streams/_global/_global/bar/previous-dependencies/_global/_gloal/foo/previous
By storing the files this way, it is easy to remove all of the previous
values for the dependencies of a task.
In addition to changing how the files are stored on disk, we have to store
the references in memory differently. A given task can now have multiple
previous references (if, say, two tasks bar and baz both depend on the
previous value). When we complete the results, we first have to collect
all of the successful tasks. Then for each successful task, we find all
of its references. For each of the references, we only complete the
value if the scope in which the task value is used is successful.
In the actual implemenation in Previous.scala, there are a number places
where we have to cast to ScopedKey[Task[Any]]. This is due to
limitations of ScopedKey and Task being type invariant. These casts are
all safe because we never try to get the value of anything, we only use
the portion of the apis of these types that are independent of the value
type. Structural typing where ScopedKey[Task[_]] gets inferred to
ScopedKey[Task[x]] forSome x is a big part of why we have problems with
type inference.
Using List instead of vector makes the code a bit more readable. We
don't need indexed access into the data structure so its unlikely that
Vector was providing any performance benefit.
As part of a documentation push, I noticed that these were undocumented
and that there were some public apis in FileStamp that I intended to be
private[sbt].
I realized it was probably not ideal to have these implicit JsonFormats
defined directly in the FileStamp object because they might
inadvertently be brought into scope with a wildcard import.
I noticed that sometimes if I changed a build source and then ran reload
in the shell, I'd still see a warning about build sources having
changed. We can eliminate this behavior by resetting the
hasCheckedMetaBuild state attribute to false and skipping the
checkBuildSources step if the current command is 'reload'. We also now
skip checking the build source step if the command is exit or reboot.
Zinc records all of the compile source file hashes when compilation
completes. This is problematic because its possible that a source file
was changed during compilation. From the user perspective, this may mean
that their source change will not be recompiled even if a build is
triggered by the change.
To overcome this, I add logic in the sbt provided external hooks to
override the zinc analysis stamps. This is done by writing the source
file stamps to the previous cache after compilation completes. This
allows us to see the source differences from sbt's perspective, rather
than zinc's perspective. We then merge the combined differences in the
actual implementation of ExternalHooks. In some cases this may result in
over-compilation but generally over-compilation is preferred to under
compilation. Most of the time, the results should be the same.
The scripted test that I added modifies a file during compilation by
invoking a macro. It then effectively asserts that the file is
recompiled during the next test run by validating the compilation result
in the test. The test fails on the latest develop hash.
Changed resources were causing the dependency layer to be invalidated on
resource changes in turbo mode because the resource layer was in between
the scala library layer. This commit reworks the layers for the
AllDependencyJars strategy so that the top layer is able to load _all_
of the resources during a test run.
The resource layer was added to address the problem that dependencies
may need to be able to load resources from the project classpath but
wouldn't be able to do so if the dependencies were in a separate layer
from the rest of the classpath. The resource layer was a classloader
that could load any resource on the full classpath but couldn't load any
classes. When I added the resource layer, I was thinking that when
resources changed, the resource class loader needed to be invalidated.
Resources, however, are different from classes in that the same
ClassLoader can find the same resources in a different place because
getResource and getResourceAsStream just return locations but do not
actually do any loading.
Taking advantage of this, I add a proxy classloader for finding
resource locations to ReverseLookupClassLoader. We can reset the
classpath of the resource loader in
ReverseLookupClassLoaderHolder.checkout. This allows us to see the new
versions of the resources without invalidating the dependency layer.
The use of '$' in the path names for streams is a pain because, since
'$' is a special character in the shell, it makes it impossible to
directly copy and paste the paths. If we make this change, some builds
will be left with vestigial directories with $global and $build in them
until they run clean. It also would break any scripts that manually
delete these paths. That doesn't seem like a common use case, but it's
worth mentioning.
The use of the persistent file stamp cache between watch runs didn't
seem to cause any issues, but there was some chance for inconsistency
between the file stamp cache and the file system so it makes sense to
put it behind the turbo flag.
After changing the default, the watch/on-change scripted test started
failing. It turns out that the reason is that the file stamp cache
managed by the watch process was not pre-filled by task evaluation. For
this reason, the first time a source file was modified, it was treated
as a creation regardless of whether or not it actually was.
To fix this, I add logic to pre-fill the watch file stamp cache if we
are _not_ persisting the file stamps between runs.
I ran a before and after with the scala build performance benchmark tool
and setting the watchPersistFileStamps key to true reduced the median
run time by about 200ms in the non-turbo case.
There was a bug where sometimes a source file change would not trigger a
new build if the change occurred during a build. Based on the logs, it
seemed to be because a number of redundant events were generated for the
same path and they triggered the anti-entropy constraint of the file
event monitor.
To fix this, I consolidated a number of observers of the global file
tree repository into a single observer. This way, I am able to ensure
that only one event is generated per file event.
I also reworked the onEvent callback to only stamp the file once. It was
previously stamping the modified source file for all of the aggregated
tasks. In the sbt project running `~compile` meant that we were stamping
a source file 22 times whenever the file changed.
This actually uncovered a zinc issue though as well. Zinc computes and
writes the hash of the source file to the analysis file after
compilation has completed. If a source file is modified during
compilation, then the new hash is written to the analysis file even when
the compilation may have been made against the previous version of the
file. Zinc will then refuse to re-compile that file until another change
is made.
I manually verified that in the sbt project if I ran `~compile` before
this change and modified a file during compilation, then no event was
triggered (there was a log message about the event being dropped due to
the anti-entropy constraint though). After this change, if I changed a
file during compilation, it seemed to always trigger, but due to the
zinc bug, it didn't always re-compile.
This was causing an abstract type pattern T is unchecked since it is
eliminated by erasure. It was unneeded because store.get[T] return
Option[(T, Long)]. I'm surprised that the compiler complained about
this.
@olegych reported that sbt would silently swallow the 'compile' command
in the multi-command, 'run;compile;reload'. I tracked this down to the
build source check. When the build has
Global / onChangedBuildSource := ReloadOnSourceChanges, the check build
sources command return a new state with "reload" prefixed. To actually
perform the reload, I returned this modified state with the prefixed
reload command.
There were two problems with this:
1) In the auto-reload case, the current command was not run after the
reload
2) If the multi-command contained reload, the auto-reload check would
have a false positive which triggered the bug in (1)
To fix this, I clear out the remaining commands before I run the check
command. That way, we know that if the remaining commands has a reload,
then it is an auto-reload. We then prefix the state with both the reload
and the current command.
I updated the scripted test for auto-reload to handle multi commands
containing reload.
In this commit, I both restore some sbt 1.2.8 behavior and enhance the
api for setting keyboard shortcuts in watch. I change the default start
message to just show the watch count, the tasks that are being monitored
and, on a new line, the instructions to terminate the watch or show more
options.
Here's what it looks like:
[info] 1. Monitoring source files for spark/compile...
[info] Press <enter> to interrupt or '?' for more options.
?
[info] Options:
[info] <enter> : interrupt (exits sbt in batch mode)
[info] <ctrl-d> : interrupt (exits sbt in batch mode)
[info] 'r' : re-run the command
[info] 's' : return to shell
[info] 'q' : quit sbt
[info] '?' : print options
I also made it so that the new options can be added (and old options
removed) with the watchInputOptions key. For example, to add an option
to reload the build with the key 'l', you could add
ThisBuild / watchInputOptions += Watch.InputOption('l', "reload", Watch.Reload)
to your global build.sbt.
After adding that to my global ~/sbt/1.0/global.sbt file, the output of
'?' became:
[info] Options:
[info] <ctrl-d> : interrupt (exits sbt in batch mode)
[info] <enter> : interrupt (exits sbt in batch mode)
[info] '?' : print options
[info] 'l' : reload
[info] 'q' : quit sbt
[info] 'r' : re-run the command
[info] 's' : return to shell
At the suggestion of @eed3si9n, instead of specifying the file cache
size in bytes, we now specify it in a formatted string. For example,
instead of specifying 128 megabytes in bytes (134217728), we can specify
it with the string "128M".
It can be quite slow to read and parse a large json file. Often, we are
reading and writing the same file over and over even though it isn't
actually changing. This is particularly noticeable with the
UpdateReport*. To speed this up, I introduce a global cache that can be
used to read values from a CacheStore. When using the cache, I've seen
the time for the update task drop from about 200ms to about 1ms. This
ends up being a 400ms time savings for test because update is called for
both Compile / compile and Test / compile.
The way that this works is that I add a new abstraction
CacheStoreFactoryFactory, which is the most enterprise java thing I've
ever written. We store a CacheStoreFactoryFactory in the sbt State.
When we make Streams for the task, we make the Stream's
cacheStoreFactory field using the CacheStoreFactoryFactory. The
generated CacheStoreFactory may or may not refer to a global cache.
The CacheStoreFactoryFactory may produce CacheStoreFactory instances
that delegate to a Caffeine cache with a max size parameter that is
specified in bytes by the fileCacheSize setting (which can also be set
with -Dsbt.file.cache.size). The size of the cache entry is estimated by
the size of the contents on disk. Since we are generally storing things
in the cache that are serialized as json, I figure that this should be a
reasonable estimate. I set the default max cache size to 128MB, which is
plenty of space for the previous cache entries for most projects. If the
size is set to 0, the CacheStoreFactoryFactory generates a regular
DirectoryStoreFactory.
To ensure that the cache accurately reflects the disk state of the
previous cache (or other cache's using a CacheStore), the Caffeine cache
stores the last modified time of the file whose contents it should
represent. If there is a discrepancy in the last modified times (which
would happen if, say, clean has been run), then the value is read from
disk even if the value hasn't changed.
* With the following build.sbt file, it takes roughly 200ms to read and
parse the update report on my compute:
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.3"
libraryDependencies += "org.scalatest" %% "scalatest" % "3.0.1"
This is because spark-sql has an enormous number of dependencies and the
update report ends up being 3MB.
Fixes#4802
For Ivy integration sbt uses credential task in a peculiar way. 9fa25de84c/main/src/main/scala/sbt/Defaults.scala (L2271-L2275)
This lets the build user put `credential` task in various places, like metabuild or root project, but they would all act as if they were scoped globally. This PR adds `allCredentials` task to emulate the behavior to pass credentials into lm-coursier.
This Resolver had the same name as the typesafe ivy resolver specified
in the launcher boot.properties. It was creating a number of verbose
warnings about having multiple resolvers with the same name. I noticed
that the ivy pattern is slightly different for the boot resolver with
this name. It didn't seem to be causing any problems to have both
resolvers.
Fixes#4839
I noticed that for a simple spark project that evaluating the test task
was faster than running run when both tasks evaluated the same code
block. I tracked this down to the BackgroundJobService.copyClasspath
method. This method was hashing the jar contents of all of the files in
the build. On my computer, this took 600ms (for context, the total run
time of the `run` task was around 1.2 seconds, which included about
150ms of scala compiling and 350ms of time in the main method). If
instead we use the last modified time it drops down to 5-10ms. As
predicted, the total runtime of `run` in this project dropped down to
600ms which was on par with `test`.
I am not sure why a hash was used rather than last modified in the first place,
so I reworked things in such a way that, by default, sbt will use a hash
but if turbo mode is on, it will use the last modified instead. We can
revisit the default later.
The multi parser had very poor performance if there were many commands.
Evaluating the expansion of something like "compile;" * 30 could cause
sbt to hang indefinitely. I believe this was due to excessive
backtracking due to the optional `(parser <~ semi.?).?` part of the
parser in the non-leading semicolon case.
I also reworked the implementation so that the multi command now has a
name. This allows us to partition the commands into multi and non-multi
commands more easily in State while still having multi in the command
list. With this change, builds and plugins can exclude the multi parser
if they wish.
Using the partitioned parsers, I removed the high/priority low priority
distinction. Instead, I made it so that the multi command will actually
check if the first command is a named command, like '~'. If it is, it
will pass the raw command argument with the named command stripped out
into the parser for the named command. If that is parseable, then we
directly apply the effect. Otherwise we prefix each multi command to the
state.
The `session save` command has the side effect of modifying a "*.sbt"
file so we don't want to warn about changes or automatically reload when
we return to the shell. Setting the hasCheckedMetaBuild attribute key to
false is sufficient to prevent this.
Ref: https://github.com/sbt/sbt/issues/4813
It didn't really make sense for Continuous to use the other command
parser and then reparse the results. I was able to slightly simplify
things by using the multi parser directly.
It was reported in https://github.com/sbt/sbt/issues/4808 that compared
to 1.2.8, sbt 1.3.0-RC2 will truncate the command args of an input task
that contains semicolons. This is actually intentional, but not
completely robust. For sbt >= 1.3.0, we are making ';' syntactically
meaningful. This means that it always represents a command separator
_unless_ it is inside of a quoted string. To enforce this, the multi parser
will effectively split the input on ';', it will then validate that each
command that it extracted is valid. If not, it throws an exception. If
the input is not a multi command, then parsing fails with a normal
failure.
I removed the multi command from the state's defined commands and reworked
State.combinedParser to explicitly first try multi parsing and fall back
to the regular combined parser if it is a regular command. If the multi
parser throws an uncaught exception, parsing fails even if the regular
parser could have successfully parsed the command. The reason is so that
we do not ever allow the user to evaluate, say 'run a;b'. Otherwise the
behavior would be inconsitent when the user runs 'compile; run a;b'
It's very expensive to compute the hash code of a deeply nested
Incomplete. To prevent a loop, we only want to check for object equality
which we can do with IdentityHashMap
It didn't make sense to aggregate the watch start command if it was
defined in multiple sources so we previously just fell back to the
default message if multiple commands were being run. This, however,
meant that if you ran, say, ~compile in an aggregate project, it wasn't
possible to customize the start message. There was a message in the
sbt gitter channel where someone found the new message too verbose and
wanted to print something shorter and I realized that this was an
unfortunate restriction. Instead of giving up, we can just use the
project's watchStartMessage as a default. If the watchStartMessage
setting is unset for some reason, we can fall back to the default.
I validated this change manually in the swoval project, which has an
aggregate root project, by running
set ThisBuild / watchStartMessage := { (_, _, _) => None }
and indeed nothing was printed after each task evaluation in '~compile'.
We discovered that turbo mode did not work in the sbt settings project.
I tracked this down to the dependency classloader bundle not being
thread safe.
I realized that some builds may crash if we automatically close the
classloaders. While I do think that is a good thing in general that we
are closing the loaders by default, we shuold have an option for
supressing this behavior.
I made all of the custom classloaders that we define for test and run
check this property before calling the super.close method.
We want to notify users about the new features available in sbt 1.3.0 to
increase visibility. Turbo mode especially can benefit many builds, but
we have opted to leave it off by default for now.
The banner will be displayed the first time sbt enters the shell command
on each sbt run. The banner can be disabled globally with the sbt.banner
system property. It can be displayed on a per sbt version by running the
skipWelcomeBanner command. That command touches a file in the ~/.sbt/1.0
directory to make it persistent across all projects.
I decided creations/deletions/updates were a bit too technical rather
than descriptive. It also wasn't really correct to say Meta build
sources because the meta build is the build for the build. Instead, I
dropped Meta from the sentence. I also made the instructions when
changed sources are detected more active. I left them capitalized since
they are instructions rather than warnings.
Apply these changes by running `reload`.
Automatically reload the build when source changes are detected by setting `Global / onChangedBuildSource := ReloadOnSourceChanges`.
Disable this warning by setting `Global / onChangedBuildSource := IgnoreSourceChanges`.
Also indentation was wrong for the printed files when multiple files had
changed because the mkString middle argument was " \n" rather than "\n ".
This creates a performance mode that enables experimental or advanced features that might require some debugging by the build user when it doesn't work.
Initially we are putting the layered ClassLoader (`ClassLoaderLayeringStrategy.AllLibraryJars`) behind this flag.