Gestalt: Integrated Support for Implementation and Analysis in Machine Learning
Authors:
Kayur Patel - University of Washington, Seattle, WA, USA
Naomi Bancroft - University of Washington, Seattle, WA, USA
Steven M. Drucker - Microsoft Research, Seattle, WA, USA
James Fogarty - University of Washington, Seattle, WA, USA
Andrew J. Ko - University of Washington, Seattle, WA, USA
James Landay - University of Washington, Seattle, WA, USA
Released:
UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology
Summary
Debugging machine learning code is a difficult problem. There is too much complex information to debug a program successfully with print statements, yet showing the data visually can also be difficult. Gestalt is a development environment that stores the data from a machine learning pipeline in a database as the code is written and shows it graphically. This makes it easy to switch between coding and analysis.
Hypothesis
The researchers want to test whether developers debug machine learning problems more efficiently with their tool than without it.
Methods
Their approach was to let data be stored in a database alongside the coding process. An algorithm pipeline can be traced using data tagged through a "Generate Attribute" menu, which allows the correctness and valuation of the data to be quantified.
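The idea of storing data next to the code can be illustrated with a small sketch. This is not Gestalt's actual interface: the table name `stage_output`, the functions `run_pipeline` and `trace`, and the keyword rule standing in for a classifier are all assumptions made for illustration. It only shows how recording each stage's output lets one example be followed through a pipeline.

```python
import sqlite3


def run_pipeline(texts, conn):
    """Run a toy two-stage pipeline, recording every stage's output."""
    # Hypothetical schema: one row per (example, stage) pair.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS stage_output "
        "(example_id INTEGER, stage TEXT, value TEXT)"
    )
    predictions = []
    for example_id, text in enumerate(texts):
        # Stage 1: feature extraction (here, just lowercase tokens).
        tokens = text.lower().split()
        conn.execute(
            "INSERT INTO stage_output VALUES (?, ?, ?)",
            (example_id, "tokens", " ".join(tokens)),
        )
        # Stage 2: a placeholder rule standing in for a trained classifier.
        label = "positive" if "good" in tokens else "negative"
        conn.execute(
            "INSERT INTO stage_output VALUES (?, ?, ?)",
            (example_id, "label", label),
        )
        predictions.append(label)
    conn.commit()
    return predictions


def trace(conn, example_id):
    """Return every recorded (stage, value) pair for one example, in order."""
    rows = conn.execute(
        "SELECT stage, value FROM stage_output "
        "WHERE example_id = ? ORDER BY rowid",
        (example_id,),
    )
    return rows.fetchall()


conn = sqlite3.connect(":memory:")
predictions = run_pipeline(["A good movie", "Not worth it"], conn)
```

Calling `trace(conn, 0)` returns the intermediate tokens and the final label for the first text, which is the kind of per-example history that print statements make hard to reconstruct.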
Two sample machine learning problems were attacked:
- Determining positive or negative tones from text
- Determining shapes from gestures
Gestalt assists with the first problem by allowing literal correct/incorrect responses from the user, which lets misclassified texts be shown easily in a visually dense manner. The second problem can be shown more directly since it is a visual problem. From a general view of all the classifications, it is simpler to determine where errors exist, such as triangles being confused with squares.
Eight participants with knowledge of machine learning and experience with Python were chosen to test Gestalt. The study involved injecting bugs into working code and timing how long participants took to debug it using generic tools compared to Gestalt.
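The gesture example, where triangles are confused with squares, amounts to reading a confusion matrix. The sketch below is only an illustration with made-up labels; the helper names `confusion_counts` and `most_confused` are hypothetical and are not taken from Gestalt or the paper.

```python
from collections import Counter


def confusion_counts(true_labels, predicted_labels):
    """Count how often each (true, predicted) label pair occurs."""
    return Counter(zip(true_labels, predicted_labels))


def most_confused(counts):
    """Return the most frequent misclassified (true, predicted) pair, or None."""
    errors = {pair: n for pair, n in counts.items() if pair[0] != pair[1]}
    if not errors:
        return None
    return max(errors, key=errors.get)


# Made-up example data: two triangles are recognized as squares.
true_labels = ["triangle", "triangle", "triangle", "square", "square", "circle"]
predicted = ["square", "square", "triangle", "square", "square", "circle"]

counts = confusion_counts(true_labels, predicted)
worst = most_confused(counts)
```

Here `worst` is the pair `("triangle", "square")`. A grid view of all classifications surfaces the same pattern at a glance, without writing this counting code by hand.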
Results
In the same amount of time, all subjects found or fixed more issues with Gestalt than without it.
Discussion
I am convinced this is a helpful tool, but I am not sure how often I would use it, which puts its significance in question for me. I don't see any faults in this paper: the authors set out to make a tool better than what exists, and they succeeded. Machine learning is a large subject in AI and will probably be much more prevalent in the future.