
Evaluation criteria

We have many different criteria for evaluating the performance of algorithms and tools for learning and testing:

  • Does the algorithm/tool aim at fully learning the benchmark, or just at providing suitable aggregate information about the data that have been gathered?
  • Number of input events required for learning/testing a model
  • Number of test sequences required for learning/testing a model
  • Wall-clock time needed for learning/testing (a reset or certain inputs may require a lot of time)
  • Quality of intermediate hypotheses: how long does it take before you get a first reasonable model?
  • How interpretable are the results? (e.g., is the tool able to discover structure, such as hierarchy and parallel composition, and are the generated counterexamples minimal?)
  • How easy is it to parallelize learning/testing?
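Several of the cost criteria above (input events, test sequences, wall-clock time) are typically measured by instrumenting the interface to the system under learning. The sketch below illustrates one way to do this: a hypothetical `CountingSUL` wrapper that counts every input symbol sent, every reset (one per test sequence), and elapsed time. The interface names are assumptions for illustration; real learning tools expose similar counters.

```python
import time

class CountingSUL:
    """Wraps a system under learning (SUL) and tracks cost metrics:
    input events, test sequences (one reset each), and wall-clock time.
    Hypothetical interface, for illustration only."""

    def __init__(self, step_fn):
        self.step_fn = step_fn        # maps (state, input) -> (state, output)
        self.state = None
        self.inputs = 0               # total input events sent to the SUL
        self.resets = 0               # total resets = number of test sequences
        self.start = time.monotonic()

    def reset(self):
        # Bring the SUL back to its initial state; counts as one test sequence.
        self.state = 0
        self.resets += 1

    def step(self, symbol):
        # Send one input event and observe the output.
        self.inputs += 1
        self.state, out = self.step_fn(self.state, symbol)
        return out

    def report(self):
        return {"inputs": self.inputs,
                "resets": self.resets,
                "seconds": time.monotonic() - self.start}

# Toy SUL: a counter modulo 3 that outputs whether it wrapped to zero.
def toy_step(state, symbol):
    state = (state + 1) % 3
    return state, state == 0

sul = CountingSUL(toy_step)
for seq in [["a", "a"], ["a", "a", "a"]]:
    sul.reset()
    for s in seq:
        sul.step(s)

stats = sul.report()
print(stats["inputs"], stats["resets"])  # → 5 2
```

With such a wrapper in place, different learning/testing tools can be compared on the same benchmark by reading off the counters after each run, independently of how each tool schedules its queries.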