Previously incremental compiler was extracting source code
dependencies by inspecting `CompilationUnit.depends` set. This set is
constructed by Scala compiler and it contains all symbols that given
compilation unit refers or even saw (in case of implicit search).
There are a few problems with this approach:
* The contract for `CompilationUnit.depend` is not clearly defined
in Scala compiler and there are no tests around it. Read: it's
not an official, maintained API.
* Improvements to incremental compiler require more context
information about given dependency. For example, we want to
distinguish between dependency on a class when you just select
members from it or inherit from it. The other example is that
we might want to know dependencies of a given class instead of
the whole compilation unit to make the invalidation logic more
precise.
That led to the idea of pushing dependency extracting logic to
incremental compiler side so it can evolve indepedently from Scala
compiler releases and can be refined as needed. We extract
dependencies of a compilation unit by walking a type-checked tree
and gathering symbols attached to them.
Specifically, the tree walk is implemented as a separate phase that
runs after pickler and extracts symbols from following tree nodes:
* `Import` so we can track dependencies on unused imports
* `Select` which is used for selecting all terms
* `Ident` used for referring to local terms, package-local terms
and top-level packages
* `TypeTree` which is used for referring to all types
Note that we do not extract just a single symbol assigned to `TypeTree`
node because it might represent a complex type that mentions
several symbols. We collect all those symbols by traversing the type
with CollectTypeTraverser. The implementation of the traverser is inspired
by `CollectTypeCollector` from Scala 2.10. The
`source-dependencies/typeref-only` test covers a scenario where the
dependency is introduced through a TypeRef only.
Introduce an alternative source dependency tracking mechanism that is
needed by upcoming name hashing algorithm. This new mechanism is
implemented by introducing two new source dependency relations called
`memberRef` and `inheritance`.
Those relations are very similar to existing `direct` and
`publicInherited` relations in some subtle ways. Those differences
will be highlighted in the description below.
Dependencies between source files are tracked in two distinct
categories:
* dependencies introduced by inheriting from a class/trait
defined in other source file
* dependencies introduced by referring (selecting) a member
defined in other source file (that covers all other
kinds of dependencies)
Due to invalidation algorithm implementation details sbt would need to
track inheritance dependencies of public classes only. Thus, we had
relation called `publicInherited`. The name hashing algorithm which
improves invalidation logic will need more precise information about
dependencies introduced by inheritance including dependencies of non-public
classes. That's one difference between `inheritance` and `publicInherited`
relations.
One surprising (to me) thing about `publicInherited` is that it includes
all base classes of a given class and not just parents. In that sense
`publicInherited` is transitive. This is a bit irregular because
everything else in Relations doesn't include transitive dependencies.
Since we are introducing new relations we have an excellent chance to
make things more regular. Therefore `inheritance` relation is
non-transitive and includes only extracted parent classes.
The access to `direct`, `publicInherited`, `memberRef` and `inheritance`
relations is dependent upon the value of `memberRefAndInheritanceDeps`
flag. Check documentation of that flag for details.
The two alternatives for source dependency tracking are implemented by
introduction of two subclasses that implement Relations trait and one
abstract class that contains some common logic shared between those two
subclasses. The two new subclasses are needed for the time being when we
are slowly migrating to the name hashing algorithm which requires
subtle changes to dependency tracking as explained above. For some time we
plan to keep both algorithms side-by-side and have a runtime switch which
allows to pick one. So we need logic for both old and new dependency
tracking to be available. That's exactly what two subclasses of
MRelationsCommon implement. Once name hashing is proven to be stable and
reliable we'll phase out the old algorithm and the old dependency tracking
logic.
When the `source-dependencies/inherited-dependencies` test fails we
get a dump of a big collection of all dependencies with absolute
file paths printed. This is not very readable when one needs to
understand the actual difference.
I decided to test dependencies of each source file separately. This way
when assertion exception is thrown we get a stack trace that points
us at the line which tested dependencies of a specific source file.
Also, all files are relative to baseDirectory of the project.
This avoids an additional cause of recursion via the semicolon/multiple command, which fixes#933.
It also provides error messages on the expanded command. This fixes#598.
The fix was made possible by the very helpful information provided by @retronym.
This commit does two key things:
1. changes the owner when splicing original trees into new trees
2. ensures the synthetic trees that get spliced into original trees do not need typechecking
Given this original source (from Defaults.scala):
...
lazy val sourceConfigPaths = Seq(
...
unmanagedSourceDirectories := Seq(scalaSource.value, javaSource.value),
...
)
...
After expansion of .value, this looks something like:
unmanagedSourceDirectories := Seq(
InputWrapper.wrapInit[File](scalaSource),
InputWrapper.wrapInit[File](javaSource)
)
where wrapInit is something like:
def wrapInit[T](a: Any): T
After expansion of := we have (approximately):
unmanagedSourceDirectories <<=
Instance.app( (scalaSource, javaSource) ) {
$p1: (File, File) =>
val $q4: File = $p1._1
val $q3: File = $p1._2
Seq($q3, $q4)
}
So,
a) `scalaSource` and `javaSource` are user trees that are spliced into a tuple constructor after being temporarily held in `InputWrapper.wrapInit`
b) the constructed tuple `(scalaSource, javaSource)` is passed as an argument to another method call (without going through a val or anything) and shouldn't need owner changing
c) the synthetic vals $q3 and $q4 need their owner properly set to the anonymous function
d) the references (Idents) $q3 and $q4 are spliced into the user tree `Seq(..., ...)` and their symbols need to be the Symbol for the referenced vals
e) generally, treeCopy needs to be used when substituting Trees in order to preserve attributes, like Types and Positions
changeOwner is called on the body `Seq($q3, $q4)` with the original owner sourceConfigPaths to be changed to the new anonymous function.
In this example, no owners are actually changed, but when the body contains vals or anonymous functions, they will.
An example of the compiler crash seen when the symbol of the references is not that of the vals:
symbol value $q3 does not exist in sbt.Defaults.sourceConfigPaths$lzycompute
at scala.reflect.internal.SymbolTable.abort(SymbolTable.scala:49)
at scala.tools.nsc.Global.abort(Global.scala:254)
at scala.tools.nsc.backend.icode.GenICode$ICodePhase.genLoadIdent$1(GenICode.scala:1038)
at scala.tools.nsc.backend.icode.GenICode$ICodePhase.scala$tools$nsc$backend$icode$GenICode$ICodePhase$$genLoad(GenICode.scala:1044)
at scala.tools.nsc.backend.icode.GenICode$ICodePhase$$anonfun$genLoadArguments$1.apply(GenICode.scala:1246)
at scala.tools.nsc.backend.icode.GenICode$ICodePhase$$anonfun$genLoadArguments$1.apply(GenICode.scala:1244)
...
Other problems with the synthetic tree when it is spliced under the original tree often result in type mismatches or some other compiler error that doesn't result in a crash.
If the owner is not changed correctly on the original tree that gets spliced under a synthetic tree, one way it can crash the compiler is:
java.lang.IllegalArgumentException: Could not find proxy for val $q23: java.io.File in List(value $q23, method apply, anonymous class $anonfun$globalCore$5, value globalCore, object Defaults, package sbt, package <root>) (currentOwner= value dir )
...
while compiling: /home/mark/code/sbt/main/src/main/scala/sbt/Defaults.scala
during phase: global=lambdalift, atPhase=constructors
...
last tree to typer: term $outer
symbol: value $outer (flags: <synthetic> <paramaccessor> <triedcooking> private[this])
symbol definition: private[this] val $outer: sbt.BuildCommon
tpe: <notype>
symbol owners: value $outer -> anonymous class $anonfun$87 -> value x$298 -> method derive -> class BuildCommon$class -> package sbt
context owners: value dir -> value globalCore -> object Defaults -> package sbt
...
The problem here is the difference between context owners and the proxy search chain.
This feature is not activated by default. To enable it set `testForkedParallel` to `true`.
The test-agent then executes the tests in a thread pool.
For now it has a fixed size set to the number of available processors.
The concurrent restrictions configuration should be used.
The completions command is meant for dump terminals that cannot use
the default tab completion. It has been built for use by the emacs
sbt-mode (see https://github.com/hvesalai/sbt-mode), but is equally
useful for other code editors that can integrate with sbt.
SecurityManager.checkAccess(ThreadGroup) is specified to be called for every Thread creation
and every ThreadGroup creation and is therefore jvm-independent. This can be used to get all
Threads associated with an application with good enough accuracy.
An application will be marked as using AWT if it gets associated with the AWT event queue thread.
To avoid unwanted side effects of accidental AWT initialization, TrapExit only tries to dispose
frames when an application is so marked. Only one AWT application is supported due to a lack of
a way to associate displayed windows with an application.
Reads/writes are a little faster with the text format,
and it's far more useful. E.g., it allows external manipulation
and inspection of the analysis.
We don't gzip the output. It does greatly shrink the files,
however it makes reads and writes 1.5x-2x slower, and we're
optimizing for speed over compactness.
It was an omission in the original commit that introduced them and didn't
mark them as private. They are purely an implementation detail and should
be hidden. We hiding them now.
Introduce a new incremental compiler option that controls
incremental compiler's treatment of macro definitions and their clients.
The current strategy is that whenever a source file containing a macro
definition is touched it will cause recompilation of all direct
dependencies of that file.
That strategy has proven to be too conservative for some projects like
Scala compiler of specs2 leading to too many source files being recompiled.
We make this behavior optional by introducing a new option
`recompileOnMacroDef` in `IncOptions` class. The default value is set to
`true` which preserves the previous behavior.
Add methods that allow one to set a new value to one of the fields of
IncOptions class. These methods are meant to be an alternative to
copy method that is hard to keep binary compatible when new fields are
added to the class.
Each copying method is related to one field of the class so when new
fields are added existing methods (and their signatures) are unaffected.
Expand case class `IncOptions` in binary compatible way so we can have
better control of methods like `unapply` when new fields are added.
Great precaution has been taken to ensure that this commit doesn't break
binary compatibility. I took a dump of javap output before and after
this change for both the class and it's companion object.
The diff is presented below:
diff -u ~/inc-options-before ~/inc-options-after
--- /Users/grek/inc-options-before 2013-11-03 14:48:45.000000000 +0100
+++ /Users/grek/inc-options-after 2013-11-03 15:53:10.000000000 +0100
@@ -9,7 +9,11 @@
public static java.lang.String transitiveStepKey();
public static sbt.inc.IncOptions setTransactional(sbt.inc.IncOptions, java.io.File);
public static sbt.inc.IncOptions defaultTransactional(java.io.File);
+ public static scala.Option unapply(sbt.inc.IncOptions);
+ public static sbt.inc.IncOptions apply(int, double, boolean, boolean, int, scala.Option, scala.Function0);
public static sbt.inc.IncOptions Default();
+ public static scala.Function1 tupled();
+ public static scala.Function1 curried();
public int transitiveStep();
public double recompileAllFraction();
public boolean relationsDebug();
diff -u inc-options-module-before inc-options-module-after
--- inc-options-module-before 2013-11-03 14:48:55.000000000 +0100
+++ inc-options-module-after 2013-11-12 21:00:41.000000000 +0100
@@ -3,6 +3,9 @@
public static final sbt.inc.IncOptions$ MODULE$;
public static {};
public sbt.inc.IncOptions Default();
+ public final java.lang.String toString();
+ public sbt.inc.IncOptions apply(int, double, boolean, boolean, int, scala.Option, scala.Function0);
+ public scala.Option unapply(sbt.inc.IncOptions);
public sbt.inc.IncOptions defaultTransactional(java.io.File);
public sbt.inc.IncOptions setTransactional(sbt.inc.IncOptions, java.io.File);
public java.lang.String transitiveStepKey();
@@ -13,7 +16,5 @@
public java.lang.String apiDiffContextSize();
public sbt.inc.IncOptions fromStringMap(java.util.Map);
public java.util.Map toStringMap(sbt.inc.IncOptions);
- public sbt.inc.IncOptions apply(int, double, boolean, boolean, int, scala.Option, scala.Function0);
- public scala.Option unapply(sbt.inc.IncOptions);
}
The first diff shows that there are just more static forwarders defined
for top-level companion object and that is binary compatible change.
The second diff shows that there are just a few minor differences in
order in which `unapply`, `apply` and bridge method for `apply` are
defined. Also, there's a new `toString` declaration. All those changes are
binary compatible.
All methods that are generated for a case class are marked as deprecated
and will be removed in the future.
The main motivation behind this commit is to reify information about
api changes that incremental compiler considers. We introduce a new
sealed class `APIChange` that has (at the moment) two subtypes:
* APIChangeDueToMacroDefinition - as the name explains, this represents
the case where incremental compiler considers an api to be changed
just because given source file contains a macro definition
* SourceAPIChange - this represents the case of regular api change;
at the moment it's just a simple wrapper around value representing
source file but in the future it will get expanded to contain more
detailed information about API changes (e.g. collection of changed
name hashes)
The APIChanges becomes just a collection of APIChange instances.
In particular, I removed `names` field that seems to be a dead code in
incremental compiler. The `NameChanges` class and methods that refer to
it in `SameAPI` has been deprecated.
The Incremental.scala has been adapted to changed signature of APIChanges
class. The `sameSource` method returns representation of APIChange
(if there's one) instead of just simple boolean. One notable change is
that information about APIChanges is pushed deeper into invalidation logic.
This will allow us to treat the APIChangeDueToMacroDefinition case properly
once name hashing scheme arrives.
This commit shouldn't change any behavior and is purely a refactoring.
The following events are logged:
* invalidation of source file due to macro definition
* inclusion of dependency invalidated by inheritance; we log both
nodes of dependency edge (dependent and dependency)
The second bullet helps to understand what's going on in case of
complex inheritance hierarchies like in Scala compiler.
The #958 describes a scenario where partially successful results are
produced in form of class files written to disk. However, if compilation
fails down the road we do not record any new compilation results (products)
in Analysis object. This leads to Analysis object and disk contents to get
out of sync.
One way to solve this problem is to use transactional ClassfileManager that
commits changes to class files on disk only when entire incremental
compilation session is successful. Otherwise, new class files are rolled
back to previous state.
The other way to solve this problem is to record time stamps of class files
in Analysis object. This way, incremental compiler can detect that class
files and Analysis object got out of sync and recover from that by
recompiling corresponding sources.
This commit uses latter solution which enables simpler (non-transactional)
ClassfileManager to handle scenario from #958.
Fixes#958
* Deprecate old mainClass method on appProvider
* Create entryPoint method which represnets class used to launch instead.
* Create PlainApplication to wrap static `main` methods. Can return
either Int, Exit or Unit.
* Detect supported 'plain' classes via reflection.
* Add new unit tests appropriate to the feature.
The ticket contains detailed description of the problem. The test case
just shows that if we set `incOptions := sbt.inc.IncOptions.Default`
then the incremental compiler doesn't recover from compiler errors
properly.
This is a temporary workaround: it assumes nothing else uses these streams later.
This condition is ok for 'export' and test streams, since these are unlikely to
reuse these streams. However, the proper fix is for the TaskStreams methods
to be smarter- they could open in append mode if the stream was closed. The
streams associated with a task could be optimistically closed after it finishes executing.
(Any task can write to another task's streams, which is why it is an optimization only.)
If an exception is thrown when accepting a connection from a forked test
agent, currently I'm seeing that all that happens is SBT hangs with no
output. Thread dumps show that the main process is waiting for the
agent to return, while the agent is waiting for the server to send it
something.
This change logs the exception, so that at least the error can be
googled. It also cleans up the server socket.
This protects against deadlocks between the writing and reading end,
since the ObjectOutputStream constructor writes a header, but does not
flush, and the ObjectInputStream constructor reads the header, and
blocks until it's read.