Serializes CompileSetup as text instead of base64-encoded
binary-serialized object.
This is necessary so that file paths in the CompileSetup can be
rebased when porting analysis files between systems.
Provide implementation of invalidation logic that takes computed
name hashes into account. The implementation is spread amongst two
classes:
1. `IncrementalNameHashing` which implements a variant of
incremental compilation algorithm that computes modified
names and delegates to `MemberReferenceInvalidationStrategy`
when invalidating member reference dependencies
2. `MemberReferenceInvalidationStrategy` which implements the
core logic of dealing with dependencies introduced by member
reference. See documentation of that class for details.
The name hashing optimization is applied when invalidating source files
having both internal and external dependencies (in initial iteration),
check `invalidateByExternal` and `invalidateSource` methods for details.
As seen in implementation of `MemberReferenceInvalidationStrategy`
the name hashing optimization is not applied when implicit members
change.
NOTE: All functionality introduced in this commit is enabled only
when `IncOptions.nameHashing` flag is set to true.
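The core idea of the optimization can be illustrated with a minimal sketch.
This is not the code of `MemberReferenceInvalidationStrategy`; the types and
names below (`ApiChange`, `invalidateByMemberRef`, string-based sources) are
hypothetical simplifications that only show how used names and modified
names are assumed to interact, and how the heuristic is skipped when
implicit members change.

```scala
object NameHashingInvalidationSketch {
  // Hypothetical simplification: a source file is identified by its name.
  type Source = String

  // Hypothetical summary of what changed in the API of one source file.
  final case class ApiChange(
    modified: Source,
    modifiedRegularNames: Set[String],
    implicitMembersChanged: Boolean
  )

  /** Sources depending on `change.modified` by member reference that need recompiling. */
  def invalidateByMemberRef(
    change: ApiChange,
    memberRefDependents: Map[Source, Set[Source]], // source -> sources referring to its members
    usedNames: Map[Source, Set[String]]            // source -> simple names used in it
  ): Set[Source] = {
    val dependents = memberRefDependents.getOrElse(change.modified, Set.empty[Source])
    if (change.implicitMembersChanged)
      dependents // the heuristic is not applied: invalidate every dependent
    else
      dependents.filter { dependent =>
        // keep only dependents that use at least one of the modified names
        usedNames.getOrElse(dependent, Set.empty[String]).exists(change.modifiedRegularNames)
      }
  }
}
```

In this model a dependent that never mentions a modified name is left alone,
which is where the savings over the old algorithm are expected to come from.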
The `source-dependencies/transitive-memberRef` test has been changed
to test name hashing variant of incremental compilation. The change
to invalidated files reflects the difference between the old and the
new algorithm.
Also, a few new tests have been added that cover issues previously found
while testing the name hashing algorithm and that are fixed in this commit.
Each paragraph describes a single test.
Add a test case which shows that we properly detect changes to type aliases
in the name hashing algorithm. See gkossakowski/sbt#6 for details.
Add test covering bug with use of symbolic names (issue
gkossakowski/sbt#5).
Add a test which covers the case where we refer to a name that is
declared in the same file. See issue gkossakowski/sbt#3 for details.
There are two categories of places in the code that need to refer to
`nameHashing` option:
* places where Analysis object is created so it gets proper
implementation of underlying Relations object
* places with logic that is specifically designed to be enabled by
that option
This commit covers both cases.
Commit 39036e7c20 introduced the
`recompileOnMacroDef` option to IncOptions. However, not all necessary
logic has been changed. This commit fixes that:
* `copy` method does not forget the value of the `recompileOnMacroDef`
flag
* `productArity` has been increased to match the arity of the class
* `productElement` returns the value of `recompileOnMacroDef` flag
* `hashCode` and `equals` methods take into account value of
`recompileOnMacroDef` flag
* fix the name of the key for `recompileOnMacroDef` flag
Move implementation of the following methods from IncrementalCommon
to IncrementalDefaultImpl:
* invalidatedPackageObjects
* sameAPI
* invalidateByExternal
* allDeps
* invalidateSource
These are the methods that are expected to have different implementation
in the name hashing algorithm. Hence, we make them abstract in
IncrementalCommon so they can be implemented differently in subclasses.
Refactor the `invalidateByExternal` method to take single, external
api change. Introduce `invalidateByAllExternal` that takes all APIChanges
object.
This way `invalidateByExternal` will have an access to APIChange object
that represents changed name hashes once name hashing is merged.
This way we'll be able to have a polymorphic implementation of this
method in the future. One implementation will use the old dependency
tracking mechanism and the other will use the new one (implemented
for name hashing).
In addition to `invalidateSources` we introduce `invalidateSource`
that invalidates dependencies of a single source. This is needed
for the name hashing algorithm because its invalidation logic
depends on information about API changes of each source file
individually.
The refactoring is done in `IncrementalCommon` class so it affects
the default implementation as well. However, this refactoring does
not affect the result of invalidation in the default implementation.
Introduce an abstract `IncrementalCommon` class that holds the
implementation of incremental compiler that was previously done in
`Incremental` class. Also, introduce `IncrementalDefaultImpl` that
inherits from IncrementalCommon.
This is the first step to introduce a design where most of incremental
compiler's logic lives in IncrementalCommon and we have two subclasses:
1. Default, which holds implementation specific to the old algorithm
known from sbt 0.13.0
2. NameHashing, which holds implementation specific to the name
hashing algorithm
This commit is purely a refactoring and does not change any behavior.
Both Logger and IncOptions instances were passed around Incremental class
implementation unmodified. Given the fact that entire implementation of
the class uses exactly the same values for those types it makes sense
to extract them as constructor arguments so they are accessible everywhere.
This helps reduce the signatures of other methods to the essential
parameters that are specific to a given method.
Move most of the functionality from `Incremental` object to its
companion class.
This commit is a preparation for making it possible to have
two different implementations of the logic in `Incremental` object.
Each of Relations implementation should have specific `toString`
implementation. For example, only `MRelationsNameHashing` implementation
should be printing used names in toString representation.
A hash for given name in a source file is computed by combining
hashes of all definitions with given name. When hashing a single
definition we take into account all information about it except nested
definitions. For example, if we have following definition
    class Foo[T] {
      def bar(x: Int): Int = ???
    }
hash sum for `Foo` will include the fact that we have a class with
a single type parameter but it won't include hash sum of `bar` method.
Computed hash sums are location-sensitive. Each definition is hashed along
with its location so we properly detect cases when definition's signature
stays the same but it's moved around in the same compilation unit.
The location is defined as sequence of selections. Each selection consists
of a name and name type. The name type is either term name or type name.
Scala specification (9.2) guarantees that each publicly visible definition
is uniquely identified by a sequence of such selectors.
For example, if we have:
    object Foo {
      class Bar { def abc: Int }
    }
then location of `abc` is Seq((TermName, Foo), (TypeName, Bar))
It's worth mentioning that we track name-hash pairs separately for
regular (non implicit) and implicit members. That's required for name
hashing algorithm because it does not apply its heuristic when implicit
members are being modified.
Another important characteristic is that we include all inherited members
when computing name hashes.
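The grouping and location-sensitive hashing described above can be sketched
as follows. This is only an illustration under simplifying assumptions: the
`Definition` class, the `signature` string standing in for "all information
except nested definitions", and the way hashes are combined are all
hypothetical and do not reflect how `HashAPI` or `NameHashing` work
internally.

```scala
object NameHashSketch {
  sealed trait NameType
  case object TermName extends NameType
  case object TypeName extends NameType

  // Hypothetical model of a public definition: `location` is the sequence of
  // selections leading to it, `signature` stands in for everything that is
  // hashed about the definition itself (nested definitions excluded).
  final case class Definition(
    location: Seq[(NameType, String)],
    name: String,
    signature: String
  )

  // A definition is hashed together with its location.
  def hashDefinition(d: Definition): Int =
    (d.location, d.name, d.signature).hashCode

  // Group definitions by simple name and combine the hashes of each group.
  // Sorting makes the result independent of the order of definitions.
  def nameHashes(defs: Seq[Definition]): Map[String, Int] =
    defs.groupBy(_.name).map { case (name, group) =>
      name -> group.map(hashDefinition).sorted.hashCode
    }
}
```

For the `Foo`/`bar` example above, the hash stored under `Foo` in this sketch
is the same whether or not `bar` is present, because `bar` is hashed under
its own name.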
Here comes the detailed list of changes made in this commit:
* HashAPI has new parameter `includeDefinitions` that allows
shallow hashing of Structures (where we do not compute hashes
recursively)
* HashAPI exposes `finalizeHash` method that allows one to capture
current hash at any time. This is useful if you want to hash a list of
definitions and not just whole `SourceAPI`.
* NameHashing implements actual extraction of public definitions,
grouping them by simple name and computing hash sums for each group
using HashAPI
* `Source` class (defined in interface/other file) has been extended to
include `_internalOnly_nameHashes` field. This field stores
NameHashes data structure for given source file. The NameHashes
stores two separate collections of name-hash pairs for regular and
implicit members.
The prefix `_internalOnly_` is used to indicate that this is not an
official incremental compiler's or sbt's API and it's for use by
incremental compiler internals only. We had to use such a prefix
because the `datatype` code generator doesn't support emitting access
modifiers
* `AnalysisCallback` implementation has been modified to gather all
name hashes and store them in the Source object
* TestCaseGenerators has been modified to implement generation of
NameHashes
* The NameHashingSpecification contains a few unit tests that make sure
that the basic functionality works properly
Tracking of used names is a component needed by the name hashing
algorithm. The extraction and storage of used names is active only when
`AnalysisCallback.nameHashing` flag is enabled and it's disabled by
default.
This change consists of two parts:
1. Modification of Relations to include a new `names` relation
that allows us to track used names in Scala source files
2. Implementation of logic that extracts used names from Scala
compilation units (that correspond to Scala source files)
The first part is straightforward: add standard set of methods in
Relations (along with their implementation) and update the logic which
serializes and deserializes Relations.
The second part is implemented as tree walk that collects all symbols
associated with trees. For each symbol we extract a simple, decoded name
and add it to a set of extracted names. Check documentation of
`ExtractUsedNames` for discussion of implementation details.
The `ExtractUsedNames` comes with unit tests grouped in
`ExtractUsedNamesSpecification`. Check that class for details.
Given the fact that we fork while running tests in `compiler-interface`
subproject and tests are run in parallel, which involves allocating
multiple Scala compiler instances, we had to bump the default memory limit.
This commit contains fixes for gkossakowski/sbt#3, gkossakowski/sbt#5 and
gkossakowski/sbt#6 issues.
Add documentation which explains how a general technique using implicit
conversions is employed in Compat class. Previously, it was hidden inside
of Compat class.
Also, I changed `toplevelClass` implementation to call
`sourceCompatibilityOnly` method that is designed for the purpose
of being a compatibility stub.
The scala/scala@2d4f0f1859 removes the
`toplevelClass` method. The recent change from
aac19fd02b introduces dependency on that
method. Combination of both changes makes incremental compiler incompatible
with Scala 2.11.
This change introduces a compatibility hack that brings back source
compatibility of incremental compiler with Scala 2.8, 2.9, 2.10 and 2.11.
The compatibility hack makes clever use of implicit conversions that
can provide dummy method definitions for methods removed from Scala
compiler.
Also, the code that depends on `enclosingTopLevelClass` has been refactored
so the dependency is more centralized.
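A minimal sketch of the technique, assuming two hypothetical stand-ins for
the compiler's `Symbol` class (one per compiler generation). The real Compat
class works against the actual compiler API; the names `AgainstOld`,
`AgainstNew`, `SymbolCompat` and `topLevel` are used here for illustration
only. The point is that the wrapper code and the call site are textually
identical in both objects, yet compile against either `Symbol`, because an
implicit conversion is only applied when the member is missing.

```scala
import scala.language.implicitConversions

object CompatSketch {
  // Stub body for methods that exist only to make the source compile.
  private def sourceCompatibilityOnly: Nothing =
    throw new UnsupportedOperationException("stub for source compatibility only")

  // Hypothetical older compiler: only `toplevelClass` exists.
  object AgainstOld {
    final class Symbol(val name: String) { def toplevelClass: String = "top:" + name }

    final class SymbolCompat(sym: Symbol) {
      def enclosingTopLevelClass: String = sym.toplevelClass // resolves to the real method
      def toplevelClass: String = sourceCompatibilityOnly
    }
    implicit def symbolCompat(sym: Symbol): SymbolCompat = new SymbolCompat(sym)

    // `enclosingTopLevelClass` is missing on Symbol, so the conversion kicks in.
    def topLevel(sym: Symbol): String = sym.enclosingTopLevelClass
  }

  // Hypothetical newer compiler: `toplevelClass` has been removed.
  object AgainstNew {
    final class Symbol(val name: String) { def enclosingTopLevelClass: String = "top:" + name }

    final class SymbolCompat(sym: Symbol) {
      // Compiles only thanks to the stub below; never called in this setup.
      def enclosingTopLevelClass: String = sym.toplevelClass
      def toplevelClass: String = sourceCompatibilityOnly
    }
    implicit def symbolCompat(sym: Symbol): SymbolCompat = new SymbolCompat(sym)

    // The real method exists, so no conversion is applied here.
    def topLevel(sym: Symbol): String = sym.enclosingTopLevelClass
  }
}
```

In both objects `topLevel` returns the value computed by the real method of
the underlying `Symbol`; the stub is never executed.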
We introduced some new members (related to name hashing) with an intent
to not expose them as public API. However, I missed some modifiers and
some members (like `memberRef` and `inheritance`) are public.
This commit fixes access modifiers to agree with the intent.
The previous name of the flag was rather specific: it indicated
whether the new source dependency tracking is supported by given Relations
object. However, there will be more functionality added to Relations that
is specific to name hashing algorithm. Therefore it makes sense to name
the flag as just `nameHashing`.
I decided to rename Relations implementation classes to be more
consistent with the name of the flag and with the purpose they serve.
The flag in AnalysisCallback (and classes implementing it) has been
renamed as well.
The documentation of `Relations.inheritance` mentions an oddity of Scala's
type checker which manifests itself in what is being tracked by that
relation in case of traits being first parent for a class/trait.
Add a test case which verifies that this oddity actually exists and it's
not harmful because it doesn't break an invariant between `memberRef`
and `inheritance` relations.
Flip `memberRefAndInheritanceDeps` flag to true which allows us to
test `memberRef` and `inheritance` relations instead of `direct` and
`publicInherited` as it was previously done.
There are a few changes to extracted dependencies from public members:
* F doesn't depend on C by inheritance anymore. The dependency on
C was coming from self type. This shows that dependencies from self
types are not considered to be dependencies introduced by inheritance
anymore.
* G depends on B by member reference now. This dependency is introduced
by applying type constructor `G.T` and expanding the result of the
application.
* H doesn't depend on D by inheritance anymore. That dependency was
introduced through B which inherits from D. This shows that only
parents (and not all base classes) are included in `inheritance`
relation.
NOTE: The second bullet highlights a bug in the old dependency tracking
logic. The dependency on B was recorded in `publicInherited` but not in
`direct` relation. This breaks the contract which says that
`publicInherited` is a subset of `direct` relation.
This is a change to dependencies extracted from non-public members:
* C depends on A by inheritance and D depends on B by inheritance now;
both changes are of the same kind: dependencies introduced by
inheritance are tracked for non-public members now. This is necessary
for the correctness of the name hashing algorithm.
Add specs2 specification (unit test) which documents current dependency
extraction logic's behavior. It exercises `direct` and `publicInherited`
relations.
This test is akin to `source-dependencies/inherited-dependencies` scripted
test. We keep both because this test will diverge in next commit to test
`memberRef` and `inheritance` relations.
The idea behind adding this test and then modifying the
`memberRefAndInheritanceDeps` flag so we test `memberRef` and `inheritance`
is that we can show precisely the differences between those two dependency
tracking mechanisms.
Adding source dependency on itself doesn't really bring any value so
there's no reason to do it. We avoided recording that kind of dependencies
by performing a check in `AnalysisCallback` implementation. However, if we
have another implementation like `TestCallback` used for testing we do
not benefit from that check.
Therefore, the check has been moved to dependency phase where dependencies
are collected.
Add `extractDependenciesFromSrcs` method to ScalaCompilerForUnitTesting
class which allows us to unit test dependency extraction logic.
See the comment attached to the method that explains the details of
how it should be used.
Refactor ScalaCompilerForUnitTesting by introducing a new method
`extractApiFromSrc` which better describes the intent than
`compileSrc`. The `compileSrc` becomes a private, utility method.
Also, `compileSrc` method changed its signature so it can take
multiple source code snippets as input. This functionality will
be used in future commits.
Previously incremental compiler was extracting source code
dependencies by inspecting `CompilationUnit.depends` set. This set is
constructed by Scala compiler and it contains all symbols that given
compilation unit refers or even saw (in case of implicit search).
There are a few problems with this approach:
* The contract for `CompilationUnit.depends` is not clearly defined
in Scala compiler and there are no tests around it. Read: it's
not an official, maintained API.
* Improvements to incremental compiler require more context
information about given dependency. For example, we want to
distinguish between dependency on a class when you just select
members from it or inherit from it. The other example is that
we might want to know dependencies of a given class instead of
the whole compilation unit to make the invalidation logic more
precise.
That led to the idea of pushing dependency extracting logic to
incremental compiler side so it can evolve independently from Scala
compiler releases and can be refined as needed. We extract
dependencies of a compilation unit by walking a type-checked tree
and gathering symbols attached to its nodes.
Specifically, the tree walk is implemented as a separate phase that
runs after pickler and extracts symbols from following tree nodes:
* `Import` so we can track dependencies on unused imports
* `Select` which is used for selecting all terms
* `Ident` used for referring to local terms, package-local terms
and top-level packages
* `TypeTree` which is used for referring to all types
Note that we do not extract just a single symbol assigned to `TypeTree`
node because it might represent a complex type that mentions
several symbols. We collect all those symbols by traversing the type
with CollectTypeTraverser. The implementation of the traverser is inspired
by `CollectTypeCollector` from Scala 2.10. The
`source-dependencies/typeref-only` test covers a scenario where the
dependency is introduced through a TypeRef only.
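The shape of the walk can be sketched on a toy model. Everything below is a
hypothetical stand-in: the real implementation traverses the Scala
compiler's own `Tree` and `Type` classes, symbols are not strings, and
`Block` is included here only to have a node with children. The sketch shows
which nodes contribute symbols and why a `TypeTree` needs a separate
traversal of its type.

```scala
object DependencyWalkSketch {
  // Toy model of a type: a reference to a symbol applied to type arguments.
  sealed trait Tpe
  final case class TypeRef(symbol: String, args: List[Tpe]) extends Tpe

  // Toy model of the tree nodes that the dependency phase inspects.
  sealed trait Tree
  final case class Import(symbol: String) extends Tree
  final case class Ident(symbol: String) extends Tree
  final case class Select(qualifier: Tree, symbol: String) extends Tree
  final case class TypeTree(tpe: Tpe) extends Tree
  final case class Block(children: List[Tree]) extends Tree

  // A complex type may mention several symbols, so all of them are collected.
  def symbolsOfType(tpe: Tpe): Set[String] = tpe match {
    case TypeRef(symbol, args) => args.flatMap(symbolsOfType).toSet + symbol
  }

  // Walk the tree and gather the symbols attached to the nodes of interest.
  def dependencies(tree: Tree): Set[String] = tree match {
    case Import(symbol)            => Set(symbol) // also covers unused imports
    case Ident(symbol)             => Set(symbol)
    case Select(qualifier, symbol) => dependencies(qualifier) + symbol
    case TypeTree(tpe)             => symbolsOfType(tpe)
    case Block(children)           => children.flatMap(dependencies).toSet
  }
}
```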
Introduce an alternative source dependency tracking mechanism that is
needed by upcoming name hashing algorithm. This new mechanism is
implemented by introducing two new source dependency relations called
`memberRef` and `inheritance`.
Those relations are very similar to existing `direct` and
`publicInherited` relations but differ in some subtle ways. Those differences
will be highlighted in the description below.
Dependencies between source files are tracked in two distinct
categories:
* dependencies introduced by inheriting from a class/trait
defined in other source file
* dependencies introduced by referring (selecting) a member
defined in other source file (that covers all other
kinds of dependencies)
Due to invalidation algorithm implementation details sbt needed to
track inheritance dependencies of public classes only. Thus, we had
relation called `publicInherited`. The name hashing algorithm which
improves invalidation logic will need more precise information about
dependencies introduced by inheritance including dependencies of non-public
classes. That's one difference between `inheritance` and `publicInherited`
relations.
One surprising (to me) thing about `publicInherited` is that it includes
all base classes of a given class and not just parents. In that sense
`publicInherited` is transitive. This is a bit irregular because
everything else in Relations doesn't include transitive dependencies.
Since we are introducing new relations we have an excellent chance to
make things more regular. Therefore `inheritance` relation is
non-transitive and includes only extracted parent classes.
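The difference between recording parents only and recording all base classes
can be shown with a small sketch. The hierarchy below is hypothetical and
works on class names rather than source files; it assumes an acyclic
hierarchy and is meant only to contrast the two notions, not to mirror the
Relations implementation.

```scala
object InheritanceRelationSketch {
  // Hypothetical hierarchy: class name -> directly declared parents.
  val parents: Map[String, Set[String]] =
    Map("H" -> Set("B"), "B" -> Set("D"), "D" -> Set.empty[String])

  // What a non-transitive `inheritance`-style relation would record.
  def inheritanceDeps(cls: String): Set[String] =
    parents.getOrElse(cls, Set.empty[String])

  // What a transitive, `publicInherited`-style relation would record:
  // all base classes, computed as the transitive closure of `parents`.
  def baseClasses(cls: String): Set[String] = {
    val direct = inheritanceDeps(cls)
    direct ++ direct.flatMap(baseClasses)
  }
}
```

With this hierarchy `H` depends only on `B` in the non-transitive view, while
the transitive view also includes `D`.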
The access to `direct`, `publicInherited`, `memberRef` and `inheritance`
relations is dependent upon the value of `memberRefAndInheritanceDeps`
flag. Check documentation of that flag for details.
The two alternatives for source dependency tracking are implemented by
introduction of two subclasses that implement Relations trait and one
abstract class that contains some common logic shared between those two
subclasses. The two new subclasses are needed for the time being when we
are slowly migrating to the name hashing algorithm which requires
subtle changes to dependency tracking as explained above. For some time we
plan to keep both algorithms side-by-side and have a runtime switch which
allows to pick one. So we need logic for both old and new dependency
tracking to be available. That's exactly what two subclasses of
MRelationsCommon implement. Once name hashing is proven to be stable and
reliable we'll phase out the old algorithm and the old dependency tracking
logic.
The TestCaseGenerators uses global set for ensuring that certain generated
values are unique. This is not the best design because the more properties
you check the harder it is to generate new sample inputs because of already
accumulated values. This results in:
[info] + Analysis.Simple Merge and Split: OK, proved property.
[info] ! Analysis.Complex Merge and Split: Gave up after only 8 passed tests. 93 tests were discarded.
I don't have an ambition to reduce the scope of this global set but at
least I wanted to make generators to work a bit harder on generating
samples.
Instead of using `suchThat` method for filtering out non-unique samples
we use `retryUntil` that never gives up (therefore it might not
terminate). We had to upgrade to latest (1.11.1) version of scalacheck
in order to have an access to `retryUntil` method.
Also, I overrode `identifier` to delegate to the original
`Gen.identifier` but with the minimal size set to 3. This means
the generated identifier will be of size 3 or larger, which is needed in
order to avoid collisions.
Reads/writes are a little faster with the text format,
and it's far more useful. E.g., it allows external manipulation
and inspection of the analysis.
We don't gzip the output. It does greatly shrink the files,
however it makes reads and writes 1.5x-2x slower, and we're
optimizing for speed over compactness.
The original commit that introduced them omitted marking them as private.
They are purely an implementation detail and should
be hidden. We are hiding them now.
Introduce a new incremental compiler option that controls
incremental compiler's treatment of macro definitions and their clients.
The current strategy is that whenever a source file containing a macro
definition is touched it will cause recompilation of all direct
dependencies of that file.
That strategy has proven to be too conservative for some projects like
Scala compiler or specs2, leading to too many source files being recompiled.
We make this behavior optional by introducing a new option
`recompileOnMacroDef` in `IncOptions` class. The default value is set to
`true` which preserves the previous behavior.
Add methods that allow one to set a new value to one of the fields of
IncOptions class. These methods are meant to be an alternative to
copy method that is hard to keep binary compatible when new fields are
added to the class.
Each copying method is related to one field of the class so when new
fields are added existing methods (and their signatures) are unaffected.
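A minimal sketch of the pattern, assuming a hypothetical two-field `Options`
class (the real `IncOptions` has more fields, and the method names and
default values used here are illustrative only):

```scala
object IncOptionsSketch {
  // Hypothetical, reduced options class with one copying method per field.
  final class Options private (
    val transitiveStep: Int,
    val recompileOnMacroDef: Boolean
  ) {
    // Each method changes exactly one field and carries over the rest, so
    // adding a new field later does not change any existing signature.
    def withTransitiveStep(value: Int): Options =
      new Options(value, recompileOnMacroDef)

    def withRecompileOnMacroDef(value: Boolean): Options =
      new Options(transitiveStep, value)
  }

  object Options {
    // Arbitrary default values chosen for this sketch.
    val Default: Options = new Options(transitiveStep = 3, recompileOnMacroDef = true)
  }
}
```

In contrast, a `copy` method takes every field as a parameter, so its
signature has to change whenever a field is added.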
Expand case class `IncOptions` in binary compatible way so we can have
better control of methods like `unapply` when new fields are added.
Great precaution has been taken to ensure that this commit doesn't break
binary compatibility. I took a dump of javap output before and after
this change for both the class and its companion object.
The diff is presented below:
diff -u ~/inc-options-before ~/inc-options-after
--- /Users/grek/inc-options-before 2013-11-03 14:48:45.000000000 +0100
+++ /Users/grek/inc-options-after 2013-11-03 15:53:10.000000000 +0100
@@ -9,7 +9,11 @@
public static java.lang.String transitiveStepKey();
public static sbt.inc.IncOptions setTransactional(sbt.inc.IncOptions, java.io.File);
public static sbt.inc.IncOptions defaultTransactional(java.io.File);
+ public static scala.Option unapply(sbt.inc.IncOptions);
+ public static sbt.inc.IncOptions apply(int, double, boolean, boolean, int, scala.Option, scala.Function0);
public static sbt.inc.IncOptions Default();
+ public static scala.Function1 tupled();
+ public static scala.Function1 curried();
public int transitiveStep();
public double recompileAllFraction();
public boolean relationsDebug();
diff -u inc-options-module-before inc-options-module-after
--- inc-options-module-before 2013-11-03 14:48:55.000000000 +0100
+++ inc-options-module-after 2013-11-12 21:00:41.000000000 +0100
@@ -3,6 +3,9 @@
public static final sbt.inc.IncOptions$ MODULE$;
public static {};
public sbt.inc.IncOptions Default();
+ public final java.lang.String toString();
+ public sbt.inc.IncOptions apply(int, double, boolean, boolean, int, scala.Option, scala.Function0);
+ public scala.Option unapply(sbt.inc.IncOptions);
public sbt.inc.IncOptions defaultTransactional(java.io.File);
public sbt.inc.IncOptions setTransactional(sbt.inc.IncOptions, java.io.File);
public java.lang.String transitiveStepKey();
@@ -13,7 +16,5 @@
public java.lang.String apiDiffContextSize();
public sbt.inc.IncOptions fromStringMap(java.util.Map);
public java.util.Map toStringMap(sbt.inc.IncOptions);
- public sbt.inc.IncOptions apply(int, double, boolean, boolean, int, scala.Option, scala.Function0);
- public scala.Option unapply(sbt.inc.IncOptions);
}
The first diff shows that there are just more static forwarders defined
for top-level companion object and that is binary compatible change.
The second diff shows that there are just a few minor differences in
order in which `unapply`, `apply` and bridge method for `apply` are
defined. Also, there's a new `toString` declaration. All those changes are
binary compatible.
All methods that are generated for a case class are marked as deprecated
and will be removed in the future.
The main motivation behind this commit is to reify information about
api changes that incremental compiler considers. We introduce a new
sealed class `APIChange` that has (at the moment) two subtypes:
* APIChangeDueToMacroDefinition - as the name explains, this represents
the case where incremental compiler considers an api to be changed
just because given source file contains a macro definition
* SourceAPIChange - this represents the case of regular api change;
at the moment it's just a simple wrapper around value representing
source file but in the future it will get expanded to contain more
detailed information about API changes (e.g. collection of changed
name hashes)
The APIChanges becomes just a collection of APIChange instances.
In particular, I removed `names` field that seems to be dead code in
incremental compiler. The `NameChanges` class and methods that refer to
it in `SameAPI` have been deprecated.
The Incremental.scala has been adapted to changed signature of APIChanges
class. The `sameSource` method returns representation of APIChange
(if there's one) instead of just simple boolean. One notable change is
that information about APIChanges is pushed deeper into invalidation logic.
This will allow us to treat the APIChangeDueToMacroDefinition case properly
once name hashing scheme arrives.
This commit shouldn't change any behavior and is purely a refactoring.
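The reified hierarchy can be sketched as below. This is a simplified model,
not the code of this commit: it uses a sealed trait with string-based
sources, and the parameters of `sameSource` as well as the order in which
the two cases are checked are assumptions made for illustration.

```scala
object ApiChangeSketch {
  // Simplified model: a source file is identified by its name.
  sealed trait APIChange { def modified: String }
  final case class APIChangeDueToMacroDefinition(modified: String) extends APIChange
  final case class SourceAPIChange(modified: String) extends APIChange

  // Returns a representation of the change, if there is one, instead of a boolean.
  // Checking the macro case first is an assumption of this sketch.
  def sameSource(source: String, apiChanged: Boolean, hasMacroDef: Boolean): Option[APIChange] =
    if (hasMacroDef) Some(APIChangeDueToMacroDefinition(source))
    else if (apiChanged) Some(SourceAPIChange(source))
    else None

  // Invalidation logic can now distinguish the reason for a change.
  def describe(change: APIChange): String = change match {
    case APIChangeDueToMacroDefinition(src) => s"$src: contains a macro definition"
    case SourceAPIChange(src)               => s"$src: public API changed"
  }
}
```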
The following events are logged:
* invalidation of source file due to macro definition
* inclusion of dependency invalidated by inheritance; we log both
nodes of dependency edge (dependent and dependency)
The second bullet helps to understand what's going on in case of
complex inheritance hierarchies like in Scala compiler.
Issue #958 describes a scenario where partially successful results are
produced in form of class files written to disk. However, if compilation
fails down the road we do not record any new compilation results (products)
in Analysis object. This leads to the Analysis object and disk contents getting
out of sync.
One way to solve this problem is to use transactional ClassfileManager that
commits changes to class files on disk only when entire incremental
compilation session is successful. Otherwise, new class files are rolled
back to previous state.
The other way to solve this problem is to record time stamps of class files
in Analysis object. This way, incremental compiler can detect that class
files and Analysis object got out of sync and recover from that by
recompiling corresponding sources.
This commit uses latter solution which enables simpler (non-transactional)
ClassfileManager to handle scenario from #958.
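The detection step can be sketched as a comparison of recorded and current
time stamps. The maps and method names below are hypothetical; the actual
Analysis object stores stamps in its own data structures and reads current
stamps from the file system.

```scala
object StampSketch {
  /** Class files whose current stamp is missing or differs from the recorded one. */
  def outOfSync(recorded: Map[String, Long], onDisk: Map[String, Long]): Set[String] =
    recorded.collect {
      case (classFile, stamp) if !onDisk.get(classFile).contains(stamp) => classFile
    }.toSet

  /** Sources that have at least one out-of-sync product and must be recompiled. */
  def sourcesToRecompile(
    products: Map[String, Set[String]], // source -> class files it produced
    recorded: Map[String, Long],
    onDisk: Map[String, Long]
  ): Set[String] = {
    val stale = outOfSync(recorded, onDisk)
    products.collect {
      case (source, classFiles) if classFiles.exists(stale) => source
    }.toSet
  }
}
```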
Fixes #958
Fix the problem with unstable names synthesized for existential
types (declared with underscore syntax) by renaming type variables
to a scheme that is guaranteed to be stable no matter where the given
existential type appears.
The scheme we use is De Bruijn-like indices that capture both the position
of a type variable declaration within a single existential type and the nesting
level of nested existential type. This way we properly support nested
existential types by avoiding name clashes.
In general, we can perform renamings like that because type variables
declared in existential types are scoped to those types so the renaming
operation is local.
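The renaming can be sketched on a toy representation of types. The `Tpe`
hierarchy and the `existential_<level>_<index>` naming format are
hypothetical; the sketch only illustrates how nesting level and declaration
position together give names that do not depend on the compiler-synthesized
ones.

```scala
object ExistentialRenamingSketch {
  // Toy model of types; the real code operates on compiler types.
  sealed trait Tpe
  final case class Ref(name: String) extends Tpe
  final case class Applied(constructor: String, args: List[Tpe]) extends Tpe
  final case class Existential(underlying: Tpe, quantified: List[String]) extends Tpe

  // Rename quantified variables to names derived from the nesting level and
  // the position of the declaration within its existential type.
  def rename(tpe: Tpe, level: Int = 0, env: Map[String, String] = Map.empty): Tpe =
    tpe match {
      case Ref(name) =>
        Ref(env.getOrElse(name, name)) // only variables in scope are renamed
      case Applied(constructor, args) =>
        Applied(constructor, args.map(rename(_, level, env)))
      case Existential(underlying, quantified) =>
        val stable = quantified.zipWithIndex.map { case (_, index) =>
          s"existential_${level}_$index"
        }
        // Inner bindings shadow outer ones, so the renaming stays local.
        val inner = env ++ quantified.zip(stable)
        Existential(rename(underlying, level + 1, inner), stable)
    }
}
```

Two occurrences of the same existential type that differ only in the
synthesized variable name are renamed to the same result in this sketch.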
There's a specs2 unit test covering instability of existential types.
The test is included in compiler-interface project and the build
definition has been modified to enable building and executing tests
in compiler-interface project. Some dependencies have been modified:
* compiler-interface project depends on api project for testing
(test makes use of SameAPI)
* dependency on junit has been introduced because it's needed
for `@RunWith` annotation which declares that specs2 unit
test should be run with JUnitRunner
SameAPI has been modified to expose a method that allows us to
compare two definitions.
This commit also adds `ScalaCompilerForUnitTesting` class that allows
us to compile a piece of Scala code and inspect information recorded
by callbacks defined in `AnalysisCallback` interface. That class uses
existing ConsoleLogger for logging. I considered doing the same for
ConsoleReporter. There's LoggingReporter defined which would fit our
usecase but it's defined in compile subproject that compiler-interface
doesn't depend on so we roll our own.
ScalaCompilerForUnitTesting uses TestCallback from compiler-interface
subproject for recording information passed to callbacks. In order
to be able to access TestCallback from compiler-interface
subproject I had to tweak dependencies between interface and
compiler-interface so test classes from the former are visible in the
latter. I also modified the TestCallback itself to accumulate apis in
a HashMap instead of a buffer of tuples for easier lookup.
An integration test has been added which tests scenario
mentioned in #823.
This commit fixes #823.