%0 Conference Proceedings %B Proceedings of the 52nd Hawai'i International Conference on System Sciences (HICSS-52) %D 2019 %T Coordination in OSS 2.0: ANT Approach %A Sangseok You %A Kevin Crowston %A Jeffery Saltz %A Yatish Hegde %K actor-network theory %K free/libre open source %K Stigmergy %X

Open source software projects are increasingly driven by a combination of independent and professional developers, the former volunteers and the latter hired by a company to contribute to the project to support commercial product development. This mix of developers has been referred to as OSS 2.0. However, we do not fully understand the multi-layered coordination spanning individuals, teams, and organizations. Using Actor-Network Theory (ANT), we describe how coordination and power dynamics unfold among developers and how different tools and artifacts both display activities and mediate coordination efforts. Internal communication within an organization was reported to cause broken links in the community, duplication of work, and political tensions. ANT shows how tools and code can exercise agency and alter a software development process as an equivalently active actor of the scene. We discuss the theoretical and practical implications of the changing nature of open source software development.

%B Proceedings of the 52nd Hawai'i International Conference on System Sciences (HICSS-52) %G eng %U http://hdl.handle.net/10125/59538 %R 10.24251/HICSS.2019.120 %> https://crowston.syr.edu./sites/crowston.syr.edu/files/hicss52a-sub2136-cam-i8-2.pdf %0 Conference Proceedings %B Proceedings of the 52nd Hawai'i International Conference on System Sciences (HICSS-52) %D 2019 %T Helping data science students develop task modularity %A Jeffery Saltz %A Heckman, Robert %A Kevin Crowston %A Sangseok You %A Yatish Hegde %K data science %K modularity %K Stigmergy %X

This paper explores the skills needed to be a data scientist. Specifically, we report on a mixed method study of a project-based data science class, where we evaluated student effectiveness with respect to dividing a project into appropriately sized modular tasks, which we termed task modularity. Our results suggest that while data science students can appreciate the value of task modularity, they struggle to achieve effective task modularity. As a first step, based on our study, we identified six task decomposition best practices. However, these best practices do not fully address this gap of how to enable data science students to effectively use task modularity. We note that while computer science/information system programs typically teach modularity (e.g., the decomposition process and abstraction), there remains a need to identify a model, corresponding to that used for computer science/information system students, for teaching modularity to data science students.

%B Proceedings of the 52nd Hawai'i International Conference on System Sciences (HICSS-52) %G eng %U http://hdl.handle.net/10125/59549 %R 10.24251/HICSS.2019.134 %> https://crowston.syr.edu./sites/crowston.syr.edu/files/modularity-HICSS-final-afterReview.pdf %0 Journal Article %J Proceedings of the ACM %D 2019 %T Socio-technical affordances for stigmergic coordination implemented in MIDST, a tool for data-science teams %A Kevin Crowston %A Jeffery Saltz %A Amira Rezgui %A Yatish Hegde %A Sangseok You %K stigmergic coordination; translucency; awareness; data-science teams %X

We present a conceptual framework for socio-technical affordances for stigmergic coordination, that is, coordination supported by a shared work product. Based on research on free/libre open source software development, we theorize that stigmergic coordination depends on three sets of socio-technical affordances: the visibility and combinability of the work, along with defined genres of work contributions. As a demonstration of the utility of the developed framework, we use it as the basis for the design and implementation of a system, MIDST, that supports these affordances and that we thus expect to support stigmergic coordination. We describe an initial assessment of the impact of the tool on the work of project teams of three to six data-science students that suggests that the tool was useful but also in need of further development. We conclude with plans for future research and an assessment of theory-driven system design.

%B Proceedings of the ACM %V 3 %P Article 117 %G eng %N CSCW %R 10.1145/3359219 %> https://crowston.syr.edu./sites/crowston.syr.edu/files/cscw117-crowstonA.pdf %0 Conference Paper %B Workshop on Interactive Language Learning, Visualization, and Interfaces, 52nd Annual Meeting of the Association for Computational Linguistics %D 2014 %T Design of an Active Learning System with Human Correction for Content Analysis %A Jasy Liew Suet Yan %A McCracken, Nancy %A Kevin Crowston %X Our research investigation focuses on the role of humans in supplying corrected examples in active learning cycles, an important aspect of deploying active learning in practice. In this paper, we discuss sampling strategies and sampling sizes in setting up an active learning system for human experiments in the task of content analysis, which involves labeling concepts in large volumes of text. The cost of conducting comprehensive human subject studies to experimentally determine the effects of sampling strategies and sampling sizes is high. To reduce those costs, we first applied an active learning simulation approach to test the effect of different sampling strategies and sampling sizes on machine learning (ML) performance in order to select a smaller set of parameters to be evaluated in human subject studies.
%B Workshop on Interactive Language Learning, Visualization, and Interfaces, 52nd Annual Meeting of the Association for Computational Linguistics %C Baltimore, MD %8 06/2014 %> https://crowston.syr.edu./sites/crowston.syr.edu/files/ILLWorkshop.ACLFormat.04.28.14.final_.pdf %0 Conference Paper %B Workshop on Language Technologies and Computational Social Science, 52nd Annual Meeting of the Association for Computational Linguistics %D 2014 %T Optimizing Features in Active Machine Learning for Complex Qualitative Content Analysis %A Jasy Liew Suet Yan %A McCracken, Nancy %A Shichun Zhou %A Kevin Crowston %X We propose a semi-automatic approach for content analysis in which a machine learning (ML) model is initially trained on a small set of hand-coded data to perform a first pass in coding, and human annotators then correct the machine annotations to produce more examples for incrementally retraining the existing model for better performance. In this “active learning” approach, it is equally important to optimize the creation of the initial ML model given less training data so that the model is able to capture most if not all positive examples, and filter out as many negative examples as possible for human annotators to correct. This paper reports our attempt to optimize the initial ML model through feature exploration in a complex content analysis project that uses a multidimensional coding scheme and contains codes with sparse positive examples. While different codes respond optimally to different combinations of features, we show that it is possible to create an optimal initial ML model using only a single combination of features for codes with at least 100 positive examples in the gold standard corpus.
%B Workshop on Language Technologies and Computational Social Science, 52nd Annual Meeting of the Association for Computational Linguistics %C Baltimore, MD %8 06/2014 %> https://crowston.syr.edu./sites/crowston.syr.edu/files/9_Paper.pdf %0 Conference Paper %B iConference %D 2014 %T Semi-Automatic Content Analysis of Qualitative Data %A Jasy Liew Suet Yan %A McCracken, Nancy %A Kevin Crowston %B iConference %C Berlin, Germany %8 03/2014 %> https://crowston.syr.edu./sites/crowston.syr.edu/files/iConference_Poster_Published.pdf