
Version 13 (modified by Neil Christensen, 14 years ago)

--

Automatic Validation Package from the Web

  • Description: A web site is being developed that will allow users to upload their model files and "stock" version files. It will run "sanity" tests on them, let the user run various processes on them, and compare the results between versions and between MEGs. It will be much easier to use than the Mathematica validations we have been running. The user will also not have to install all the MEGs; the web validation maintainers will take care of that. I envision the first version being useful to us (the FR developers), and a later version (next year) being ready for the public and possibly part of a "model database".
  • Names: Neil, Claude, Benj, Christian.
  • Status: 50% of the essential, developer-required code is done.
  • Done:
    • Full FR model can be uploaded.
    • User can choose the current or the development version of FR. At present, only development works; "current" will be ready for the next release.
    • Site creates dir for model, stores model files there in tar.gz format.
    • Site creates entries in database for model.
    • Site starts condor jobs to:
      • Load model files
      • Load restriction files.
      • Run CH, FA, MG, SH, WO interfaces.
      • Run CH & MG on resulting code.
      • Each of these condor jobs relies on previous ones. If a necessary previous step fails, then the next job is not run. This is controlled by a "DAG" script.
    • If one or more MEGs pass, the user is allowed to run 2->2 cross-section (cs) processes.
      • They can choose the MEG they want to test (among the ones that pass the previous tests).
      • User can choose "restrictions" on 2->2 processes.
        • If no restrictions are chosen, then _all_ 2->2 processes are generated.
        • User can restrict to certain kinds of fields (fermions, vectors,...)
        • User can restrict on indices and charges.
    • After processes are generated, user can run them.
      • The processes are run on Condor.
        • Currently 4 parallel nodes. Seems pretty fast.
        • Adds 10 processes to the queue at a time. This lets multiple users and validations run concurrently and all make progress.
        • User can refresh their browser to see the progress. Each process and MEG shows up as soon as it finishes.
      • User can rerun processes.
    • User can create multiple validations.
    • Multiple validations can be run concurrently. Condor handles this.
    • Multiple users can run at the same time. Condor handles this.
    • Users can delete models. (But only when no condor jobs are running for this model.)
    • Users can delete validations. (But only when this validation is not being run.)
    • Security:
      • All user code is run in condor jobs and nowhere else.
      • All MySQL operations are run on the head node as POST scripts and are not mingled with user code. This allows all network connections to be closed on condor jobs (except those initiated by condor).
      • Home dir has been unmounted from condor jobs, so user code has no access to any dir outside its virtual env.
      • Users do not control condor. They can only use the web form to start jobs and see the outcome.
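The job chain described under "Done" (load model files, load restriction files, run the interfaces, then run CH and MG on the resulting code) can be pictured as a dependency graph. The following is a minimal Python sketch, not the site's actual "DAG" script. The job names, the `.sub` file names and the exact dependency edges are assumptions made for illustration. It also assumes the script is an HTCondor DAGMan input file, whose `JOB` and `PARENT ... CHILD` lines it emits. `skipped_jobs` mirrors the rule that a job is not run when a necessary previous step fails.

```python
# Hypothetical job layout: names and edges are illustrative assumptions.
INTERFACES = ["CH", "FA", "MG", "SH", "WO"]  # interfaces run for every model
RUNNABLE = ["CH", "MG"]                      # MEGs that are run on the generated code


def build_dependencies():
    """Map each job name to the list of jobs it directly depends on."""
    deps = {
        "load_model": [],
        "load_restrictions": ["load_model"],
    }
    for meg in INTERFACES:
        deps[f"interface_{meg}"] = ["load_restrictions"]
    for meg in RUNNABLE:
        deps[f"run_{meg}"] = [f"interface_{meg}"]
    return deps


def render_dag(deps):
    """Render the graph as DAGMan-style JOB and PARENT/CHILD lines."""
    lines = [f"JOB {job} {job}.sub" for job in deps]
    for job, parents in deps.items():
        if parents:
            lines.append(f"PARENT {' '.join(parents)} CHILD {job}")
    return "\n".join(lines) + "\n"


def skipped_jobs(deps, failed):
    """Return the jobs that are never run because an earlier step failed."""
    blocked = set(failed)
    changed = True
    while changed:  # propagate failures until nothing new is blocked
        changed = False
        for job, parents in deps.items():
            if job not in blocked and any(p in blocked for p in parents):
                blocked.add(job)
                changed = True
    return [job for job in deps if job in blocked and job not in failed]


if __name__ == "__main__":
    deps = build_dependencies()
    print(render_dag(deps))
    print(skipped_jobs(deps, {"interface_MG"}))
```

Under these assumed edges, a failure of the MG interface blocks only the MG run, while a failure to load the model blocks every later job.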
  • To-do (for this summer):
    • Get FR-SH and FR-WO working. (Christian Speckner is helping with WO). This shouldn't be difficult. The scaffolding is there.
    • Test whether a Mathematica license is available. If not, wait and try again.
    • Use parameter files in tests.
    • Allow upload and comparison with "stock" versions of models.
    • Improve security:
      • Test that nothing can be written outside of condor job dir.
      • Test that a condor job cannot make any network connections _except_ certain database updates.
      • Control time, memory and disk space available to condor jobs (admin side).
    • Record the versions of all software used (partly done).
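The "wait and try again" item for the Mathematica license amounts to a bounded retry loop. The sketch below is only an illustration of that idea in Python. How the site would actually detect a free license is not described here, so the `is_available` callable is a hypothetical placeholder supplied by the caller, and the attempt count and delay are arbitrary assumptions.

```python
import time


def wait_for_license(is_available, max_attempts=5, delay_seconds=60.0,
                     sleep=time.sleep):
    """Poll a caller-supplied check until it succeeds or attempts run out.

    is_available: hypothetical zero-argument callable returning True when
    a license can be used. Returns the attempt number that succeeded, or
    None if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        if is_available():
            return attempt
        if attempt < max_attempts:
            sleep(delay_seconds)  # wait before trying again
    return None


if __name__ == "__main__":
    # Simulated check: the license frees up on the third poll.
    answers = iter([False, False, True])
    print(wait_for_license(lambda: next(answers), delay_seconds=0.0))
```

Passing `sleep` as a parameter keeps the loop testable without real delays. A job that gets `None` back could be reported as failed rather than left waiting indefinitely.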
  • To-do (not guaranteed for this summer, but hopefully by summer of 2011):
    • Add links for moving around the model database.
    • Add history (if a model changes, keep the old model with versioning. Do the same for validations, etc.)
    • Finish the other pages of the model database web site (login, etc.).
    • Allow the user to modify part of their model (currently they have to delete the model and start over).
      • Rerun only the pieces that were modified.
    • Allow user to stop running Condor jobs.
    • Support phase space tests.
    • Support 1->2 decay tests.
    • Support 1->3 decay tests.
    • Support 2->3 processes?
    • Support other model info of interest (authors, URLs, ...).
    • Connect username and password with wiki username and password?
    • Set up other Condor nodes at other institutions to run jobs.
      • LLN would be the main node which would host the website and submit the jobs to Condor.
      • Condor would run jobs on one of the nodes, which would exist at:
        • LLN
        • Maybe UW?
        • Maybe Strasbourg?
    • Add admin pages for...