- remove unused type params
- use `withFilter` if possible
- use `collectFirst` instead of `collect` and `headOption`
- use `length` instead of `size` if `Array` or `String`
- use `foreach` instead of `map`
This implements Selective functor for `Either[A, B]` "task" (`Initialize[Task[Either[A, B]]]`).
The selective functor allows an encoding of if-expression:
```
def ifS[A](
x: Def.Initialize[Task[Boolean]]
)(t: Def.Initialize[Task[A]])(e: Def.Initialize[Task[A]]): Def.Initialize[Task[A]]
```
The benefit of this approach is that task dependencies are still visible to inspect command.
To demonstrate [-Yno-lub](http://eed3si9n.com/stricter-scala-with-ynolub), this shows the code changes that removes lubing (Not all subprojects are done).
After I made the changes, I switched the Scala back to normal 2.12.10.
We have seen failures in scripted to create the output file for streams
and it has also been reported in https://github.com/sbt/sbt/issues/5067.
I believe this may caused by the same stream output being initialized by
multiple tasks. To fix this, I add locking on a per-file basis. There
was (and is) additional synchronization on the Streams _instance_, but
the per-file locks are stored in the Streams companion object so the
locking should be honored no matter which Streams instance calls make.
I also skip the call to IO.touch if the file already exists. I believe
that this is the common case and since IO.touch was being called with
setModified = false, it should be fine to skip the touch when the file
exists.
Prior to this change, I was able to induce the issue in #5067 in roughly
1/50 of calls to `scripted actions/cross-multiproject`. I wasn't able to
reproduce after this change.
It can be quite slow to read and parse a large json file. Often, we are
reading and writing the same file over and over even though it isn't
actually changing. This is particularly noticeable with the
UpdateReport*. To speed this up, I introduce a global cache that can be
used to read values from a CacheStore. When using the cache, I've seen
the time for the update task drop from about 200ms to about 1ms. This
ends up being a 400ms time savings for test because update is called for
both Compile / compile and Test / compile.
The way that this works is that I add a new abstraction
CacheStoreFactoryFactory, which is the most enterprise java thing I've
ever written. We store a CacheStoreFactoryFactory in the sbt State.
When we make Streams for the task, we make the Stream's
cacheStoreFactory field using the CacheStoreFactoryFactory. The
generated CacheStoreFactory may or may not refer to a global cache.
The CacheStoreFactoryFactory may produce CacheStoreFactory instances
that delegate to a Caffeine cache with a max size parameter that is
specified in bytes by the fileCacheSize setting (which can also be set
with -Dsbt.file.cache.size). The size of the cache entry is estimated by
the size of the contents on disk. Since we are generally storing things
in the cache that are serialized as json, I figure that this should be a
reasonable estimate. I set the default max cache size to 128MB, which is
plenty of space for the previous cache entries for most projects. If the
size is set to 0, the CacheStoreFactoryFactory generates a regular
DirectoryStoreFactory.
To ensure that the cache accurately reflects the disk state of the
previous cache (or other cache's using a CacheStore), the Caffeine cache
stores the last modified time of the file whose contents it should
represent. If there is a discrepancy in the last modified times (which
would happen if, say, clean has been run), then the value is read from
disk even if the value hasn't changed.
* With the following build.sbt file, it takes roughly 200ms to read and
parse the update report on my compute:
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.3"
libraryDependencies += "org.scalatest" %% "scalatest" % "3.0.1"
This is because spark-sql has an enormous number of dependencies and the
update report ends up being 3MB.