Improve multi parser performance

The multi parser had very poor performance if there were many commands.
Evaluating the expansion of something like "compile;" * 30 could cause
sbt to hang indefinitely. I believe this was due to excessive
backtracking due to the optional `(parser <~ semi.?).?` part of the
parser in the non-leading semicolon case.

I also reworked the implementation so that the multi command now has a
name. This allows us to partition the commands into multi and non-multi
commands more easily in State while still having multi in the command
list. With this change, builds and plugins can exclude the multi parser
if they wish.

Using the partitioned parsers, I removed the high/priority low priority
distinction. Instead, I made it so that the multi command will actually
check if the first command is a named command, like '~'. If it is, it
will pass the raw command argument with the named command stripped out
into the parser for the named command. If that is parseable, then we
directly apply the effect. Otherwise we prefix each multi command to the
state.
This commit is contained in:
Ethan Atkins 2019-06-24 21:28:58 -07:00
parent 9cc8913131
commit 60b1ac7ac4
10 changed files with 97 additions and 45 deletions

View File

@ -157,39 +157,88 @@ object BasicCommands {
val nonSemi = charClass(_ != ';', "not ';'")
val semi = token(OptSpace ~> ';' ~> OptSpace)
val nonDelim = charClass(c => c != '"' && c != '{' && c != '}', label = "not '\"', '{', '}'")
val cmdPart = OptSpace ~> matched(
token(
(nonSemi & nonDelim).map(_.toString) | StringEscapable | braces('{', '}'),
hide = const(true)
).+
) <~ OptSpace
val strictParser: Option[Parser[String]] =
state.map(s => OptSpace ~> matched(s.nonMultiParsers) <~ OptSpace)
val parser = strictParser.map(sp => sp & cmdPart).getOrElse(cmdPart)
val components = ((nonSemi & nonDelim) | StringEscapable | braces('{', '}')).+
val cmdPart = matched(components)
val completionParser: Option[Parser[String]] =
state.map(s => OptSpace ~> matched(s.nonMultiParser) <~ OptSpace)
val cmdParser = completionParser.map(sp => sp & cmdPart).getOrElse(cmdPart).map(_.trim)
val multiCmdParser: Parser[String] = semi ~> cmdParser
/*
* There are two cases that need to be handled separately:
* 1) There are multiple commands separated by at least one semicolon with an optional
* leading semicolon.
* 2) There is a leading semicolon, but only on one command
* 1) leading semicolon with one or more commands separated by a semicolon
* 2) no leading semicolon and at least one command followed by a trailing semicolon
* and zero or more commands separated by a semicolon
* Both cases allow an optional trailing semi-colon.
*
* These have to be handled separately because the performance degrades badly if the first
* case is implemented with the following parser:
* (semi.? ~> ((combinedParser <~ semi).* ~ combinedParser <~ semi.?)
*/
(semi.? ~> (parser <~ semi).+ ~ (parser <~ semi.?).?).flatMap {
case (prefix, last) =>
(prefix ++ last).toList.map(_.trim).filter(_.nonEmpty) match {
case Nil => Parser.failure("No commands were parsed")
case cmds => Parser.success(cmds)
}
} | semi ~> parser.map(_.trim :: Nil) <~ semi.?
val noLeadingSemi = (cmdParser ~ (multiCmdParser.+ | semi.map(_ => Nil))).map {
case (prefix, last) => (prefix :: Nil ::: last.toList).filter(_.nonEmpty)
}
val leadingSemi = multiCmdParser.+.map(_.toList.filter(_.nonEmpty))
((leadingSemi | noLeadingSemi) <~ semi.?).flatMap {
case Nil => Parser.failure("No commands were parsed")
case commands => Parser.success(commands)
}
}
def multiParser(s: State): Parser[List[String]] = multiParserImpl(Some(s))
def multiApplied(s: State): Parser[() => State] =
Command.applyEffect(multiParser(s))(_ ::: s)
def multiApplied(state: State): Parser[() => State] =
Command.applyEffect(multiParserImpl(Some(state))) {
// the (@ _ :: _) ensures tail length >= 1.
case commands @ first :: (tail @ _ :: _) =>
require(first.head != ' ', s"Commands must be trimmed. Received: '$first'.")
// Note: scalafmt refuses to align on '*' using multiline /*...*/ here
//
// This case is only executed if the multi parser actually returns multiple commands to run.
// Otherwise we just prefix the single extracted command with the semicolon stripped to the
// state. Since there is no semicolon in the stripped command, the multi command parser will
// fail to parse that single command so we do not end up in a loop.
//
// If there are multiple commands, we give a named command a chance to parse the raw input
// and possibly directly evaluate the side effects. This is desirable if, for example,
// the command runs other commands. The continuous (~) command is one such example. If the
// input command is `~foo; bar`, the multi parser would extract "~foo" :: "bar" :: Nil.
// If we naively just prepended the state with both of these commands, it would run '~foo'
// and then when watch exited, it'd run 'bar', which is likely unexpected. To address this,
// we search for a named command whose name is a prefix of the first command in the list.
// In this case, we'd find '~' and then pass 'foo;bar' into its parser. If this succeeds,
// which it will in the case of continuous (so long as foo and bar are valid commands),
// then we directly evaluate the `() => State` returned by the parser. Otherwise, we
// fall back to prefixing the multi commands to the state.
//
state.nonMultiCommands.view.flatMap {
case command =>
command.nameOption match {
case Some(commandName) if first.startsWith(commandName) =>
// A lot of commands expect leading semicolons in their parsers. In order to
// ensure that they are multi-command capable, we strip off any leading spaces.
// Without doing this, we could run simple commands like `set` with the full
// input. This would likely fail because `set` doesn't know how to handle
// semicolons. This is a bit of a hack that is specifically there
// to handle `~` which doesn't require a space before its argument. Any command
// whose parser accepts multi commands without a leading space should be accepted.
// All other commands should be rejected. Note that `alias` is also handled by
// this case.
val commandArgs =
(first.drop(commandName.length).trim :: Nil ::: tail).mkString(";")
parse(commandArgs, command.parser(state)).toOption
case _ => None
}
case _ => None
}.headOption match {
case Some(s) => s()
case _ => commands ::: state
}
case commands => commands ::: state
}
def multi: Command = Command.custom(multiApplied, Help(Multi, MultiBrief, MultiDetailed))
val multi: Command =
Command.custom(multiApplied, Help(Multi, MultiBrief, MultiDetailed), Multi)
lazy val otherCommandParser: State => Parser[String] =
(s: State) => token(OptSpace ~> combinedLax(s, NotSpaceClass ~ any.*))

View File

@ -52,8 +52,11 @@ private[sbt] final class SimpleCommand(
private[sbt] final class ArbitraryCommand(
val parser: State => Parser[() => State],
val help: State => Help,
val tags: AttributeMap
val tags: AttributeMap,
override val nameOption: Option[String]
) extends Command {
def this(parser: State => Parser[() => State], help: State => Help, tags: AttributeMap) =
this(parser, help, tags, None)
def tag[T](key: AttributeKey[T], value: T): ArbitraryCommand =
new ArbitraryCommand(parser, help, tags.put(key, value))
}
@ -124,6 +127,8 @@ object Command {
def customHelp(parser: State => Parser[() => State], help: State => Help): Command =
new ArbitraryCommand(parser, help, AttributeMap.empty)
private[sbt] def custom(parser: State => Parser[() => State], help: Help, name: String): Command =
new ArbitraryCommand(parser, const(help), AttributeMap.empty, Some(name))
def custom(parser: State => Parser[() => State], help: Help = Help.empty): Command =
customHelp(parser, const(help))

View File

@ -41,25 +41,10 @@ final case class State(
currentCommand: Option[Exec],
next: State.Next
) extends Identity {
/*
* The `~` and `alias` commands effectively run other commands so they need to be run before
* the multi parser. For example, if the user runs `~foo;bar` and the multi parser runs before
* the `~` parser, then then it will be parsed as two commands `~foo` and `bar`. By running
* the high priority commands before the multi parser, we ensure that the high priority commands
* parse the full command input. Any other command that runs other commands would likely need
* to be added to this list but at the time of writing this comment {~, alias} are the two
* commands that we know need this special treatment.
*
* TODO: add a structured way of indicating that a command needs to run before the multi parser.
*/
private[this] val highPriorityCommands = Set("~", "alias")
private[this] lazy val (highPriority, regularPriority) =
definedCommands.partition(_.nameOption.exists(highPriorityCommands))
private[this] lazy val highPriorityParser = Command.combine(highPriority)(this)
private[this] lazy val lowPriorityParser = Command.combine(regularPriority)(this)
private[sbt] lazy val nonMultiParsers = highPriorityParser | lowPriorityParser
lazy val combinedParser =
highPriorityParser | BasicCommands.multiApplied(this) | lowPriorityParser
private[sbt] lazy val (multiCommands, nonMultiCommands) =
definedCommands.partition(_.nameOption.contains(BasicCommandStrings.Multi))
private[sbt] lazy val nonMultiParser = Command.combine(nonMultiCommands)(this)
lazy val combinedParser = multiCommands.foldRight(nonMultiParser)(_.parser(this) | _)
def source: Option[CommandSource] =
currentCommand match {

View File

@ -190,6 +190,7 @@ object BuiltinCommands {
def DefaultCommands: Seq[Command] =
Seq(
multi,
about,
tasks,
settingsCommand,

View File

@ -146,9 +146,9 @@ private[sbt] object Continuous extends DeprecatedContinuous {
val ocp = BasicCommands.multiParserImpl(Some(state)) |
BasicCommands.otherCommandParser(state).map(_ :: Nil)
(digitParser.? ~ ocp).flatMap {
case (i, cmds) if cmds.exists(_.nonEmpty) =>
Parser.success((i.getOrElse(0), cmds.filter(_.nonEmpty)))
case (_, cmds) => Parser.failure("Couldn't parse any commands")
case (i, commands) if commands.exists(_.nonEmpty) =>
Parser.success((i.getOrElse(0), commands.filter(_.nonEmpty)))
case (_, _) => Parser.failure("Couldn't parse any commands")
}
}

View File

@ -18,3 +18,5 @@ checkInput := checkInputImpl.evaluated
val dynamicTask = taskKey[Unit]("dynamic input task")
dynamicTask := { println("not yet et") }
crossScalaVersions := "2.11.12" :: "2.12.8" :: Nil

View File

@ -0,0 +1 @@
class Foo

View File

@ -34,3 +34,7 @@
> checkInput foo
> compile; checkInput foo
> ++ 2.11.12 compile; setStringValue bar; checkStringValue bar
> ++2.12.8 compile; setStringValue foo; checkStringValue foo

View File

@ -19,3 +19,6 @@
> ~; compile; setStringValue string.txt baz; checkStringValue string.txt baz
# Ensure that trailing semi colons work
> ~ compile; setStringValue string.txt baz; checkStringValue string.txt baz;
# Ensure that no space is required between '~' and the command
> ~compile;setStringValue string.txt foo;checkStringValue string.txt foo

View File

@ -185,6 +185,8 @@ final class ScriptedTests(
case "actions/cross-multiproject" => LauncherBased // tbd
case "actions/cross-multi-parser" =>
LauncherBased // java.lang.ClassNotFoundException: javax.tools.DiagnosticListener when run with java 11 and an old sbt launcher
case "actions/multi-command" =>
LauncherBased // java.lang.ClassNotFoundException: javax.tools.DiagnosticListener when run with java 11 and an old sbt launcher
case "actions/external-doc" => LauncherBased // sbt/Package$
case "actions/input-task" => LauncherBased // sbt/Package$
case "actions/input-task-dyn" => LauncherBased // sbt/Package$