The use of the persistent file stamp cache between watch runs didn't
seem to cause any issues, but there was some chance for inconsistency
between the file stamp cache and the file system so it makes sense to
put it behind the turbo flag.
After changing the default, the watch/on-change scripted test started
failing. It turns out that the reason is that the file stamp cache
managed by the watch process was not pre-filled by task evaluation. For
this reason, the first time a source file was modified, it was treated
as a creation regardless of whether or not it actually was.
To fix this, I add logic to pre-fill the watch file stamp cache if we
are _not_ persisting the file stamps between runs.
I ran a before and after with the scala build performance benchmark tool
and setting the watchPersistFileStamps key to true reduced the median
run time by about 200ms in the non-turbo case.
There was a bug where sometimes a source file change would not trigger a
new build if the change occurred during a build. Based on the logs, it
seemed to be because a number of redundant events were generated for the
same path and they triggered the anti-entropy constraint of the file
event monitor.
To fix this, I consolidated a number of observers of the global file
tree repository into a single observer. This way, I am able to ensure
that only one event is generated per file event.
I also reworked the onEvent callback to only stamp the file once. It was
previously stamping the modified source file for all of the aggregated
tasks. In the sbt project running `~compile` meant that we were stamping
a source file 22 times whenever the file changed.
This actually uncovered a zinc issue though as well. Zinc computes and
writes the hash of the source file to the analysis file after
compilation has completed. If a source file is modified during
compilation, then the new hash is written to the analysis file even when
the compilation may have been made against the previous version of the
file. Zinc will then refuse to re-compile that file until another change
is made.
I manually verified that in the sbt project if I ran `~compile` before
this change and modified a file during compilation, then no event was
triggered (there was a log message about the event being dropped due to
the anti-entropy constraint though). After this change, if I changed a
file during compilation, it seemed to always trigger, but due to the
zinc bug, it didn't always re-compile.
This was causing an abstract type pattern T is unchecked since it is
eliminated by erasure. It was unneeded because store.get[T] return
Option[(T, Long)]. I'm surprised that the compiler complained about
this.
@olegych reported that sbt would silently swallow the 'compile' command
in the multi-command, 'run;compile;reload'. I tracked this down to the
build source check. When the build has
Global / onChangedBuildSource := ReloadOnSourceChanges, the check build
sources command return a new state with "reload" prefixed. To actually
perform the reload, I returned this modified state with the prefixed
reload command.
There were two problems with this:
1) In the auto-reload case, the current command was not run after the
reload
2) If the multi-command contained reload, the auto-reload check would
have a false positive which triggered the bug in (1)
To fix this, I clear out the remaining commands before I run the check
command. That way, we know that if the remaining commands has a reload,
then it is an auto-reload. We then prefix the state with both the reload
and the current command.
I updated the scripted test for auto-reload to handle multi commands
containing reload.
In this commit, I both restore some sbt 1.2.8 behavior and enhance the
api for setting keyboard shortcuts in watch. I change the default start
message to just show the watch count, the tasks that are being monitored
and, on a new line, the instructions to terminate the watch or show more
options.
Here's what it looks like:
[info] 1. Monitoring source files for spark/compile...
[info] Press <enter> to interrupt or '?' for more options.
?
[info] Options:
[info] <enter> : interrupt (exits sbt in batch mode)
[info] <ctrl-d> : interrupt (exits sbt in batch mode)
[info] 'r' : re-run the command
[info] 's' : return to shell
[info] 'q' : quit sbt
[info] '?' : print options
I also made it so that the new options can be added (and old options
removed) with the watchInputOptions key. For example, to add an option
to reload the build with the key 'l', you could add
ThisBuild / watchInputOptions += Watch.InputOption('l', "reload", Watch.Reload)
to your global build.sbt.
After adding that to my global ~/sbt/1.0/global.sbt file, the output of
'?' became:
[info] Options:
[info] <ctrl-d> : interrupt (exits sbt in batch mode)
[info] <enter> : interrupt (exits sbt in batch mode)
[info] '?' : print options
[info] 'l' : reload
[info] 'q' : quit sbt
[info] 'r' : re-run the command
[info] 's' : return to shell
At the suggestion of @eed3si9n, instead of specifying the file cache
size in bytes, we now specify it in a formatted string. For example,
instead of specifying 128 megabytes in bytes (134217728), we can specify
it with the string "128M".
It can be quite slow to read and parse a large json file. Often, we are
reading and writing the same file over and over even though it isn't
actually changing. This is particularly noticeable with the
UpdateReport*. To speed this up, I introduce a global cache that can be
used to read values from a CacheStore. When using the cache, I've seen
the time for the update task drop from about 200ms to about 1ms. This
ends up being a 400ms time savings for test because update is called for
both Compile / compile and Test / compile.
The way that this works is that I add a new abstraction
CacheStoreFactoryFactory, which is the most enterprise java thing I've
ever written. We store a CacheStoreFactoryFactory in the sbt State.
When we make Streams for the task, we make the Stream's
cacheStoreFactory field using the CacheStoreFactoryFactory. The
generated CacheStoreFactory may or may not refer to a global cache.
The CacheStoreFactoryFactory may produce CacheStoreFactory instances
that delegate to a Caffeine cache with a max size parameter that is
specified in bytes by the fileCacheSize setting (which can also be set
with -Dsbt.file.cache.size). The size of the cache entry is estimated by
the size of the contents on disk. Since we are generally storing things
in the cache that are serialized as json, I figure that this should be a
reasonable estimate. I set the default max cache size to 128MB, which is
plenty of space for the previous cache entries for most projects. If the
size is set to 0, the CacheStoreFactoryFactory generates a regular
DirectoryStoreFactory.
To ensure that the cache accurately reflects the disk state of the
previous cache (or other cache's using a CacheStore), the Caffeine cache
stores the last modified time of the file whose contents it should
represent. If there is a discrepancy in the last modified times (which
would happen if, say, clean has been run), then the value is read from
disk even if the value hasn't changed.
* With the following build.sbt file, it takes roughly 200ms to read and
parse the update report on my compute:
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.3"
libraryDependencies += "org.scalatest" %% "scalatest" % "3.0.1"
This is because spark-sql has an enormous number of dependencies and the
update report ends up being 3MB.
Fixes#4802
For Ivy integration sbt uses credential task in a peculiar way. 9fa25de84c/main/src/main/scala/sbt/Defaults.scala (L2271-L2275)
This lets the build user put `credential` task in various places, like metabuild or root project, but they would all act as if they were scoped globally. This PR adds `allCredentials` task to emulate the behavior to pass credentials into lm-coursier.
This Resolver had the same name as the typesafe ivy resolver specified
in the launcher boot.properties. It was creating a number of verbose
warnings about having multiple resolvers with the same name. I noticed
that the ivy pattern is slightly different for the boot resolver with
this name. It didn't seem to be causing any problems to have both
resolvers.
Fixes#4839
I noticed that for a simple spark project that evaluating the test task
was faster than running run when both tasks evaluated the same code
block. I tracked this down to the BackgroundJobService.copyClasspath
method. This method was hashing the jar contents of all of the files in
the build. On my computer, this took 600ms (for context, the total run
time of the `run` task was around 1.2 seconds, which included about
150ms of scala compiling and 350ms of time in the main method). If
instead we use the last modified time it drops down to 5-10ms. As
predicted, the total runtime of `run` in this project dropped down to
600ms which was on par with `test`.
I am not sure why a hash was used rather than last modified in the first place,
so I reworked things in such a way that, by default, sbt will use a hash
but if turbo mode is on, it will use the last modified instead. We can
revisit the default later.
In some cases, comp.append could be an empty string. This would happen
if a parser was something like `(token(foo) <~ ;).+ <~ fo.?` because there were
no completions for the `fo` available anchor. The effect of this was
that tab would never complete foo;f to foo;foo, even though that was the
only possible completion. It would, _display_, foo as a possible
completion though.
This came up because the multi parser has a similar parser to that
described above and it broke tab completion to the right of a semi
colon.
The multi parser had very poor performance if there were many commands.
Evaluating the expansion of something like "compile;" * 30 could cause
sbt to hang indefinitely. I believe this was due to excessive
backtracking due to the optional `(parser <~ semi.?).?` part of the
parser in the non-leading semicolon case.
I also reworked the implementation so that the multi command now has a
name. This allows us to partition the commands into multi and non-multi
commands more easily in State while still having multi in the command
list. With this change, builds and plugins can exclude the multi parser
if they wish.
Using the partitioned parsers, I removed the high/priority low priority
distinction. Instead, I made it so that the multi command will actually
check if the first command is a named command, like '~'. If it is, it
will pass the raw command argument with the named command stripped out
into the parser for the named command. If that is parseable, then we
directly apply the effect. Otherwise we prefix each multi command to the
state.
The `session save` command has the side effect of modifying a "*.sbt"
file so we don't want to warn about changes or automatically reload when
we return to the shell. Setting the hasCheckedMetaBuild attribute key to
false is sufficient to prevent this.
Ref: https://github.com/sbt/sbt/issues/4813
It didn't really make sense for Continuous to use the other command
parser and then reparse the results. I was able to slightly simplify
things by using the multi parser directly.
We run into issues if we naively split the command input on ';' and
treat each part as a separate command unless the ';' is inside of a
string because it is also valid to have ';'s inside of braced
expressions, e.g. `set foo := { val x = 1; x + 1 }`. There was no parser
for expressions enclosed in braces. I add one that should parse any
expression wrapped in braces so long as each opening brace is matched by a
closing brace. The parser returns the original expression. This allows
the multi parser to ignore ';' inside of '{...}'.
I had to rework the scripted tests to individually run 'reload' and
'setUpScripted' because the new parser rejects setUpScripted because it
isn't a valid command until reload has run.
It was reported in https://github.com/sbt/sbt/issues/4808 that compared
to 1.2.8, sbt 1.3.0-RC2 will truncate the command args of an input task
that contains semicolons. This is actually intentional, but not
completely robust. For sbt >= 1.3.0, we are making ';' syntactically
meaningful. This means that it always represents a command separator
_unless_ it is inside of a quoted string. To enforce this, the multi parser
will effectively split the input on ';', it will then validate that each
command that it extracted is valid. If not, it throws an exception. If
the input is not a multi command, then parsing fails with a normal
failure.
I removed the multi command from the state's defined commands and reworked
State.combinedParser to explicitly first try multi parsing and fall back
to the regular combined parser if it is a regular command. If the multi
parser throws an uncaught exception, parsing fails even if the regular
parser could have successfully parsed the command. The reason is so that
we do not ever allow the user to evaluate, say 'run a;b'. Otherwise the
behavior would be inconsitent when the user runs 'compile; run a;b'
I noticed that in 1.3.0-RC2, the following commands worked:
++2.11.12 ;test
;++2.11.12;test
but
++2.11.12; test
did not. This issue is fixed in the next commit.