→ something that we advise our clients on all the time, and that I think takes people by surprise sometimes, is that most organizations’ default is to treat their data science projects like the software engineering projects they’re currently running at the organization
Nov 22, 2018
highlights & Science
something that we advise our clients on all the time, and that I think takes people by surprise sometimes, is that most organizations’ default is to treat their data science projects like the software engineering projects they’re currently running at the organization. So they want their data scientists to be filling out Jira tickets and running Sprints. But data science isn’t that kind of task, and neither is data engineering, or platform architecture for that matter. They all share something in common: in data science, data engineering, and platform architecture, you can spend forever on something and it won’t be done. So it’s all about, “When do I feel like stopping?” Or, “When do I run out of money?” Rather than, “Okay, this application is done. I’ll ship it, it’s in a box. It’s all good to go. We release it to the world and we sell it. It’s great.” On the data science side it’s hard to tell how long something’s going to take until you do it. So there’s this chicken-and-egg problem: I can’t write the Jira ticket saying it’s going to take two weeks until I actually spend the two weeks to do it, and realize it’s actually going to take four weeks. And so when you try to apply these traditional software engineering project management practices to these projects, it doesn’t work. It actually causes harm in a lot of cases….there’s actually a new discipline that needs to arise. — https://blog.dominodatalab.com/collaboration-data-science-data-engineering-true-false/
→ I’ve actually heard a project manager say, “You know, any line of code that my developers write to audit what they’re doing, to put stuff in a database, is a line of code that they’re not putting in developing the application
Nov 21, 2018
highlights & data & Science
I’ve actually heard a project manager say, “You know, any line of code that my developers write to audit what they’re doing, to put stuff in a database, is a line of code that they’re not putting in developing the application.” And so they frequently incur a huge technical debt: they’ve got this great application now, but when it comes time for phase two of the project, to do something interesting with the data that this application should have stored somewhere but didn’t, we’re kind of left holding the bag because the application developers were kind of short-sighted. And to my mind this is the kind of short-term thinking that hinders really good data science. — https://blog.dominodatalab.com/collaboration-data-science-data-engineering-true-false/
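For a sense of scale, the kind of audit code being dismissed in that quote is often just a handful of lines. A minimal sketch in Python, with an invented table and event schema (none of the names below come from the source):

```python
import sqlite3
from datetime import datetime, timezone

def record_event(conn: sqlite3.Connection, user_id: str, action: str, detail: str) -> None:
    """Append one application event to an audit table for later analysis."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS audit_log (ts TEXT, user_id TEXT, action TEXT, detail TEXT)"
    )
    conn.execute(
        "INSERT INTO audit_log VALUES (?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), user_id, action, detail),
    )
    conn.commit()

# Usage: one call at each interesting point in the application.
conn = sqlite3.connect("app.db")
record_event(conn, user_id="u42", action="checkout", detail="cart_total=59.90")
```

The trade the quote describes is exactly this: a few lines per event now, versus a phase-two project with no data to work from later.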
→ [it] is a symptom of really bad project management
Nov 21, 2018
highlights & Science
[it] is a symptom of really bad project management. It seems to me that the way to solve this problem is to have everybody in the room when the project is being designed … It’s sort of like life insurance. You know, you don’t really need it until you need it, but you’ve got to keep having it, even when you don’t need it. The projects that I’ve seen that have been most successful are the projects in which the data scientists, the data engineers, and… the application developers are all there in the room from the beginning, with the customer talking about what the problem is they want to solve, what a minimal product is, what the final solution should be, what the users expect out of this. And if you start from that place you’re much more likely to get empathy. …That’s the first thing. — https://blog.dominodatalab.com/collaboration-data-science-data-engineering-true-false/
→ in our consulting engagements, and also two other data science consulting companies that I know and work with, if we have a pure play data science project, meaning that the data engineering’s not in scope, the customer said that they were going to take care of it, we won’t start work until we have proof that the data’s been loaded
Nov 20, 2018
highlights & Science
in our consulting engagements, and also two other data science consulting companies that I know and work with, if we have a pure play data science project, meaning that the data engineering’s not in scope because the customer said they were going to take care of it, we won’t start work until we have proof that the data’s been loaded. We’ve been burned so many times by them saying like, “Oh, you know what? You guys can start on Monday. We’ll get the data loaded sometime next week.” We’re not even going to start until that data’s there….that’s the other issue too with the data engineer. I actually ran into this issue….on the younger side of the data engineers, one of the issues that we run into is that they don’t have the seniority to stand up to some ancient Oracle DBA that’s not willing to play nice. …it’s a really hard role to fill because, you’re right,… the interpersonal skills, and the political navigation skills are really important for the data engineer. — https://blog.dominodatalab.com/collaboration-data-science-data-engineering-true-false/
→ There are seemingly myriad terms to describe people who interact with models
Nov 19, 2018
highlights & Learning & Science
There are seemingly myriad terms to describe people who interact with models. Just a few of the terms currently in use include researchers, data scientists, machine learning researchers, machine learning engineers, data engineers, infrastructure engineers, DataOps, DevOps, etc. Both Miner and Presser agreed that the work itself existed before any of these terms were assigned to it. Presser defines data engineering as embodying the skills to obtain data, build data stores, manage data flows including ETL, and provide the data to data scientists for analysis. Presser also indicated that data engineers at large enterprise organizations have to be well versed in “cajoling” data from departments that may not, at first glance, provide it. Miner agreed and indicated that there is more thought leadership around the definition of data science than of data engineering, which contributes to the ambiguity within the market. — https://blog.dominodatalab.com/collaboration-data-science-data-engineering-true-false/
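To make Presser’s definition concrete, here is a toy version of that obtain/store/provide cycle in Python. The source URL, column names, and warehouse table are hypothetical, invented purely for illustration:

```python
import csv
import io
import sqlite3
import urllib.request

SOURCE_URL = "https://example.com/export/orders.csv"  # hypothetical upstream system

def extract(url: str) -> list[dict]:
    """Obtain raw records from a source system."""
    with urllib.request.urlopen(url) as resp:
        return list(csv.DictReader(io.TextIOWrapper(resp, encoding="utf-8")))

def transform(rows: list[dict]) -> list[tuple]:
    """Clean and reshape records into the schema analysts expect."""
    return [
        (r["order_id"], r["customer_id"], float(r["amount"]))
        for r in rows
        if r.get("amount")  # drop records downstream analysis cannot use
    ]

def load(rows: list[tuple], db_path: str = "warehouse.db") -> None:
    """Provide the cleaned data where data scientists can query it."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (order_id TEXT, customer_id TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()

if __name__ == "__main__":
    load(transform(extract(SOURCE_URL)))
```

The “cajoling” Presser mentions is everything this sketch leaves out: getting access to the source system in the first place.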
→ Over the past five years, we have heard many stories from data science teams about their successes and challenges when building, deploying, and monitoring models
Nov 19, 2018
highlights & Science
Over the past five years, we have heard many stories from data science teams about their successes and challenges when building, deploying, and monitoring models. Unfortunately, we have also heard that many companies have internalized the model myth, or the misconception that data science should be treated like software development or data assets. This misconception is completely understandable. Data science involves code and data. Yet, people leverage data science to discover answers to previously unsolvable questions. As a result, data science work is more experimental, iterative, and exploratory than software development. Data science work involves computationally intensive algorithms that benefit from scalable compute and sometimes requires specialized hardware like GPUs. Data science work also requires data, a lot more data than typical software products require. All of these needs (and more) highlight how data science work differs from software development. These needs also highlight the vital importance of collaboration between data science and engineering, particularly for innovative model-driven companies seeking to maintain or grow their competitive advantage.
Yet, collaboration between data science and engineering is a known challenge. — https://blog.dominodatalab.com/collaboration-data-science-data-engineering-true-false/
→ The disparity in ways that file selection is presented or obstructed in apps is bewildering and frustrating.
Nov 17, 2018
highlights
“The only way apps should be doing it currently is with iOS 11 style file APIs, but many apps have either legacy file solutions, bespoke (ie, confusingly different—and differently-abled) file pickers or would rather pull you into their own cloud platform. The disparity in ways that file selection is presented or obstructed in apps is bewildering and frustrating. It forces me to become the expert on file picker UI and capabilities from app to app—time consuming, pointless knowledge that should be learnt once at a platform level. Of the legacy methods; copying/sharing from iTunes is prehistoric, duplicating files between apps is barbaric, and sharing files between apps with WebDAV is soporific. If you have better options, use them. Many apps offer some specific integration with a cloud service—some to widen your options, others to steer you to their own eco-system.”
— https://medium.com/@pixelthing/fe-webdev-on-ipad-pro-2018-c55283f01e4c
Well put. Some file organizing difficulties are so obtuse that I wonder if developers use their own apps.
→ One point of discussion that researchers across many different strands of visualization research agreed upon was that visualization research is not as visible as it should be to the many people designing, critiquing, and otherwise using visualization in the world
Nov 15, 2018
highlights
One point of discussion that researchers across many different strands of visualization research agreed upon was that visualization research is not as visible as it should be to the many people designing, critiquing, and otherwise using visualization in the world. This includes data scientists, information designers, business analysts, researchers in other fields, and numerous others. Some aspects of this disconnect are institutional — our research is often buried behind paywalls. But we suspect that a large portion of it is also due to the way that visualization researchers, like many scientists, are not always motivated or skilled in communicating their research to a broader audience. — https://medium.com/multiple-views-visualization-research-explained/why-we-should-be-talking-about-visualization-research-4d28a2032de3
→ One theme was the importance of ensuring that policymakers are provided with research syntheses and summaries that match their needs
Nov 13, 2018
highlights & Health
One theme was the importance of ensuring that policymakers are provided with research syntheses and summaries that match their needs. This was the lesson from a study from Monash University, Australia, that interviewed 43 policymakers on a strategy aimed at supporting the use of systematic reviews in health policy. The policymakers in this study overwhelmingly agreed that research summaries were critical to increase the use of research. The study demonstrated a need for layered or “graded-entry” formats, ranging from short summaries all the way up to detailed reports. It showed the need for a mechanism to assess policymakers’ changing research needs so they could be targeted with a relevant synthesis.
This was also the message from a 2011 study from the US, which tested four different types of policy briefs on mammography screening with nearly 300 policymakers. The study authors found different types of policymakers tended to like different types of briefs, with staffers more likely to report a preference for story-focused briefs and legislators preferring data-focused briefs.
Another theme to emerge was the need for better collaboration between researchers and policymakers, and for the two to build closer relationships. One large study, involving nearly 1,000 policymakers, looked at an intervention where researchers presented their findings directly to policymakers in either traditional seminars or interactive roundtables. Policymakers agreed that such presentations stimulated their thinking, that the interactive roundtables were more relevant to their needs compared with traditional seminars, and that the new knowledge could be used in their work as policymakers.
Three of the studies under review focused on increasing organisational capacity to use research. A Canadian study looked at a scheme to improve implementation of best practice guidelines in health promotion programmes using a team of “organisational champions”, while a Dutch study explored the use of masterclasses for public health professionals and policymakers supporting a practice-based research project. — http://blogs.lse.ac.uk/impactofsocialsciences/2018/11/12/less-than-5-of-papers-on-the-use-of-research-in-health-policymaking-tested-interventions-to-see-what-worked-but-those-studies-reveal-a-number-of-strategies-for-improvement/
→ The first thing to note is that, although we identified over 300 papers on the use of research in health policymaking, the vast majority of these were descriptive
Nov 12, 2018
highlights & Health
The first thing to note is that, although we identified over 300 papers on the use of research in health policymaking, the vast majority of these were descriptive. Very few – in fact just 14 of 304 articles – actually concerned testing interventions to see whether they worked. There is a serious discrepancy, therefore, between surging interest in this area and the small number of studies actually testing strategies.
The 14 articles we did find (reporting on 13 intervention strategies) tended to be methodologically weak. Only one study used an experimental design, while one other used a pre/post-test design. The others used a range of approaches and were characterised by an absence of control groups, small sample sizes, and self-report data. Most measured outcomes related to factors that influence research use rather than actual research use. — http://blogs.lse.ac.uk/impactofsocialsciences/2018/11/12/less-than-5-of-papers-on-the-use-of-research-in-health-policymaking-tested-interventions-to-see-what-worked-but-those-studies-reveal-a-number-of-strategies-for-improvement/