Friday, May 12, 2017

As with most data science technology conferences these days (including ML/IoT/etc.), the majority of talks were current, interesting, sometimes self-serving, but overall very insightful. So I will start by acknowledging that the organizers of the Big Data Innovation Summit in San Francisco this week did a nice job of assembling a group of speakers who reflected the current state of the industry in all things data science.  However, I do take issue with the blatant lack of female presenters over the course of the two-day event. I admit that I did not attend every session over the two days, but I did attend the majority, and it was not until the afternoon of the second day that I heard a female speaker who was not acting as a moderator. At a time when we should be promoting the examples of female-led teams in data science and related areas of study, attending a conference located down the road from one of the world’s epicenters of data science talent and seeing such a lack of representation among the speakers is concerning. 
The data science team that I lead at TWC/IBM is 60% female. While a small portion of my team is on board thanks to an IBM initiative specifically designed to recruit women into technology careers (which I have written about here), even without this program there would still be roughly a 50/50 male-female split on my team. This is simply because I look to bring in the best associates for the positions that are available. And I know, from talking with other science/technology leaders in similar roles at other companies, that while the distribution might not be as balanced as it is on my team, it certainly is not as lopsided as what was seen among the speakers at this event. Further, a simple search of teams doing interesting and innovative things in data science will reveal that there is no shortage of high-performing and innovative female-led teams in industry, academia and government. 
It is no secret that gender bias exists in science and technology. However, a high-profile event that is so one-sided with respect to speaker gender is clearly sending the wrong message. Comments/thoughts are welcome.

Predictive Analytics Summit 2017, San Diego

The Predictive Analytics Summit was held in San Diego this week. Kudos to Innovation Enterprise for doing a very good job of pulling together a diverse lineup of speakers, representing a balanced cross section of analytics/data science leaders at their respective companies. This does not always happen at data conferences. Listening to several of the talks throughout the first day was a good way to take the pulse of analytics integration across multiple industries. It was also good to hear that while many of the organizations represented are facing similar challenges, they are indeed bullish on the opportunities ahead of us. In addition, the talks served as a good sense check to gauge how ‘data science’ has evolved, not as a discipline, but as a way of thinking about technology and analytics, and their integration into any type of commercial organization. Remember that data science and analytics are not just for the Bloombergs, the Googles and the IBMs of the world. Fortunately for me, my talk came at the end of the day, giving me the chance to weave some of what I heard throughout the day into the discussion. A few of the points that I stressed in my talk follow:
  • Let’s get past the ‘Data Science’ moniker. There is no one-size-fits-all data scientist. Just as a biologist can range from a molecular biologist working on genomic sequencing to a field ecologist working in wildlife biology, a data scientist can be many different things. Being cognizant of other skill sets that contribute to data science is not the same thing as being an expert in everything, which, given the pace and scope of the tools and technologies involved in analysis and machine learning, is nothing short of impossible. 
  • Domain expertise is undervalued. This is related to the point above, but goes further: while many people can be fantastic technical specialists, understanding what they want to get out of the data, and translating the findings into something of value, is not a simple thing to do. The perspective that an individual brings when developing solutions in a team environment is also important, as is creativity in the approach. Owing to the bias that comes from my own entry into data science (mathematics & physics), I can almost look at the team that I want as a Complex Adaptive System, evolving along with the profession and the relevant technologies. (See the writings of Yaneer Bar-Yam of the New England Complex Systems Institute for more on this.)  
  • It is also important to be clear on the distinction between the roles of data scientist and business analyst. This is not in any way an attempt to diminish the value of an analyst, but the work of an analyst by definition differs from that of a scientist. A scientist needs to be looking at problems and asking questions of the data, constructing an environment that is amenable to machine learning, and thinking about scale. Many roles that fall under the data science label can easily default to analyst work.
  • Data Science/Analytics/Machine Learning is not a Zero-Sum Game. When new improvements ranging from insights to technologies enter the market, the entire ecosystem benefits.  
  • Diversity & STEM. We do not only hire students right out of their masters or PhD program. IBM has been a very big proponent of expanding the opportunities to non-traditional candidates. For more on this, continue reading here.
  • The Wisdom of Crowds and Blind Faith in Models. I feel that this is one of the most important themes in data science today, and one which I will expand upon in future articles. Data science and machine learning are centered on deriving value from data and building quantitative solutions that (hopefully) have some predictive capacity. And we are getting very good at doing this. However, the misuse of models, and the trust that can be put in learning systems without skepticism, can also be dangerous. Here, I can draw on the many years that I spent (and still spend) in the commodities sector, where I built models that pulled together diverse data to build market positions around risk. Every day at the end of the trading session, my scoreboard was the market. If my thesis was correct, there was confirmation, and if wrong, there was nowhere to hide. But this transparency is not always present in other disciplines that require decision support. This should force data science/machine learning practitioners to always be suspicious of models, even if they ‘know’ that they are accurate. There will always be something that happens tomorrow that today’s model equations do not account for, so there are no true quantitative absolutes. It follows that an appropriate course is to always anticipate that your model or predictive system will fail, and to build into your decision structure a way to protect against the downside of bad recommendations.
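To make that last point concrete, here is a minimal, purely illustrative Python sketch (all names and thresholds are my own invention, not any production trading system) of one way to "anticipate failure": track the model's recent realized error and scale down reliance on its signal as that error grows, rather than trusting it blindly.

```python
# Hypothetical sketch: discount a model's signal as its recent
# out-of-sample error grows, instead of trusting it unconditionally.
from collections import deque


class GuardedModel:
    def __init__(self, predict, window=20, max_error=10.0):
        self.predict = predict              # the underlying model
        self.errors = deque(maxlen=window)  # rolling realized errors
        self.max_error = max_error          # error level at which trust hits zero

    def record(self, prediction, actual):
        """Log the realized error once the outcome is known."""
        self.errors.append(abs(prediction - actual))

    def trust(self):
        """1.0 = full trust in the model, 0.0 = ignore it entirely."""
        if not self.errors:
            return 1.0
        avg = sum(self.errors) / len(self.errors)
        return max(0.0, 1.0 - avg / self.max_error)

    def signal(self, x):
        """Model output scaled by current trust -- the downside guard."""
        return self.predict(x) * self.trust()
```

The specific discounting rule here is arbitrary; the point is only that the decision structure, not the model, owns the final word on how much weight a prediction gets.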
This conversation will extend through more talks at upcoming conferences and forums. Thanks again to Meg Rimmer and the Innovation Enterprise staff for organizing an engaging and worthwhile event.

Reflections on reThinking Food: Agriculture Data as a Platform

The reThink food 2016 event has come and gone, and once again, it did not disappoint. Any time spent at the Culinary Institute of America in St. Helena is a treat, but the discussions, and the evolution of the conference itself since I first attended two years ago, have taken on bigger and more pressing challenges and opportunities, with data being the central theme.
A conference of this nature runs the risk of focusing the discussions on very esoteric topics, with no real potential applicability beyond the test laboratories, computational models or experimental food kitchens of the attendees. This was not the case at reThink. Each session and discussion throughout the three-day tech (& food) feast truly stressed the importance of open data, participatory science and markets, trials and transparency, and the goal of providing solutions that will impact global populations dealing with issues related to both food and nutrition security. I was honored to share the stage with Caleb Harper and Kevin Esvelt of the MIT Media Lab to close the conference, and hopefully we left the attendees with something to think about in advance of next year. If anyone in the food/sustainability space needs to remind themselves every once in a while why they are doing what they are doing, just google one of Caleb’s videos and it becomes clear. And Kevin’s discussion of his work on gene drive technology, which touched not only on the science but also on the philosophical dimensions related to the future of agriculture, had me downloading several papers for the flight home.
It is crucial to emphasize one component of future global agriculture systems which I believe will only grow in importance in the coming decades: the need for open and distributed data. The 2015 Wharton IGEL report ‘Feeding the World’ notes that 1 in 9 of the world’s population does not receive adequate food and nutrition, and that this is as much a logistics and distribution issue as it is a production issue. Water stress will only exacerbate the situation in the coming decades. As data pertaining to all aspects of the food system comes online, increasing in volume and frequency by the day, it is necessary to make the case that this information is only useful if it is decentralized, open, and in an accessible and usable format. Also, the question really is not around the volume of data, but instead what to do with the data. Agricultural information now spans in scale from the genome to the biome, and viewing the food production, distribution and market system as a platform requires a shift in how we view the most basic necessity for life. The Internet of Things (IoT), by definition, is built upon the collection, dissemination and utilization of information pertaining to nearly all areas of society. It can be argued that agriculture can and should be one of the foundational components of this IoT network.
Work in the OpenAG group at the MIT Media Lab is at the forefront of this shifting dialogue around food data. What if we could code for specific environmental variables, whereby ‘climate recipes’ could be tested, optimized, and shared, with the result being a global library of open phenotypes as part of a truly distributed agricultural system? What if, instead of relying so heavily on monoculture (which will always have a place), a supplementary food production system capable of optimizing both controlled-environment and traditional agricultural practices were able to fill the gaps where food requirements are not secure? What if we could select for specific parameters that could grow more food while conserving biological and physical resources in the process? These are not issues that sit solely in the domain of science and technology - they can also be the foundation for significant commercial opportunities.  Taken further, ideas related to a digital agriculture platform can quite possibly evolve to serve as a risk management tool against factors such as weather and climate variability, water stress and supply chain disruption. Looking ahead, I can imagine future reThink and other related conferences addressing such topics as:
  • Making agriculture a central foundational element of IoT
  • Bringing cognitive technologies to phenotype selection via climate recipes
  • Seed through consumption - opportunities for the Circular Economy
  • Transparency: Open Access and Global Participation
  • Redefining biodiversity
I often hear AI and cognitive computing described as being in the top of the first inning. I feel that with respect to agriculture data as a commercial and technological platform, we are still warming up, as the game has not even started. Looking forward to reThink 2017.
Hats off to my Media Lab colleagues Hildreth England and Caleb Harper, as well as Nicki Briggs, for organizing an energizing and entertaining event.

Reflections on CYPHER-2016

Connected Devices.  Artificial Intelligence.  Machine Learning.  Cognitive Computing.  Semantic Web.  The Internet of Things (IoT) & the Internet of Everything.  Being at the CYPHER 2016 India Analytics Summit was like spending a few days at an all-you-can-absorb technology buffet, with discussions spanning all things analytics.  Equally impressive was my precursor to the summit, a day at the Mu Sigma Innovation Lab.  However, after returning from the Summit, which was held in Bangalore last week, two thoughts are top of mind.  First, there is an amazing amount of potential and enthusiasm in India and elsewhere around the prospects associated with connected devices, which can lead to dramatic economic, social and environmental benefits.  Second, as some real definitions are finally being formed around what I will call the basket of IoT technologies, the need for domain experts in the above-mentioned technologies is now clearer than it has ever been.  Realization of this need will determine the course that guides the evolution of IoT technologies over the coming decades.
As I noted in my pre-conference interview, I am excited about the potential of IoT and the prospect for connected devices to improve lives.  And by this I mean the majority of lives, who up to this point, may not even be part of the connected economy.  I stressed this point numerous times again in my talk, as I do elsewhere.  The same behavioral and location-based methods that can be used to better understand group consumer behavior can also be applied to situations involving pollution prevention, contagion modeling, urban planning, market development, and disease transmission, among many other things.  The more domain experts that participate in open science and collaboration from a variety of backgrounds, the more that methods will be shared, which will in turn benefit science, business and society.
This is not to say that technical competency is overemphasized; in fact, it is just the opposite.  Having technical expertise in topics under the analytics umbrella is a prerequisite for doing something useful in this space, whether it be machine learning, quantitative modeling, commercial feasibility analysis, or infrastructure engineering.  But those who have technical fluency in one or more areas, coupled with an appreciation of how the technology can be applied in different commercial and social settings (which comes from domain expertise), will be driving the agenda.  I heard far too many discussions of technologies and approaches with an ‘analytics will solve all’ mantra.  Digging a little deeper, many of these ideas are not commercially feasible, or they are so specialized that they will have a hard time finding a significant market, which is ultimately needed to prove out a concept.  It should be noted that this is not unique to this particular conference - it only underscores what seems to be driving the discussions I hear elsewhere.
Again, I am bullish on IoT.  And by IoT I am more inclined to envision a world where technology is always active but more seamlessly integrated into the background noise of our lives.  Robots performing repetitive tasks is one thing, but think a little bigger and imagine how connecting people all around the world, and giving them access to participate in global markets with price transparency, will change commercial activity for the better.  This can serve as the catalyst for real supplier-to-consumer trade, which is by definition the foundation for a market. And doing this in a way that is less intrusive to participants, by providing access to information rather than prescribing advice, will allow regional and subregional markets to continue to flourish.  Sidenote: it was fitting that I picked up Parag Khanna’s Connectography at the airport for the flight home, which resonates with many of these views embracing technology and informatics, with an optimistic and participatory vision for the future.
Informatics and analytics will continue to play a larger part in our lives, and this is a good thing.  However, it is important to balance the development of the tools and technologies with an appreciation for the markets where they are intended to operate, and this includes placing the evolution of the IoT discussion in its appropriate historical, economic and social context.

Wednesday, February 15, 2017


Petrichor → Auld Lang Syne → Suzy Greenberg

Climate Scientist opportunity @ICF, Washington DC

Climate Scientist


ICF is seeking a Climate Scientist to support our Climate, Energy Efficiency, and Transportation line of business in Washington, DC.  The majority of the work will support the Climate Adaptation and Resilience portfolio, which works at all levels of government and with commercial clients to understand and manage risks posed by climate change, and to enhance the resilience of energy and water supplies, transportation and cyber systems, other critical infrastructure, and the vital societal and natural elements that make our communities whole. Improving resilience requires proactive planning and leadership, from the federal level to the municipal level, and successful implementation relies on clearly communicated, peer-reviewed science and strong public-private partnerships.
The consultant chosen for this position will provide technical expertise related to the analysis and interpretation of weather and climate data to support this work.
This position will be based in Washington, D.C.
What you’ll be doing…
  • Serve as a technical expert on multi-disciplinary teams for climate adaptation and resilience projects
  • Provide technical expertise and activity leadership related to the analysis of weather and climate data
  • Prepare presentations, reports, memoranda, and other communication materials as needed.
  • Complete tasks in a fast-paced, self-motivated environment in a timely and efficient manner.


What you’ll need to have…
  • Master’s degree in atmospheric science, meteorology, geography, or Earth science, with a focus on physical climate science.
  • 6+ years of relevant experience.
  • Ability to analyze, interpret, and appropriately apply climate model output and meteorological observations for climate impact and risk analyses.
  • Outstanding oral and written communication skills; ability to communicate complex climate information to non-scientists to inform decision making processes
  • Organized, detail-oriented, and able to prioritize and multi-task
  • Skilled in one or more programming languages (e.g., R) to process large environmental data sets
  • Proficiency in MS Office Applications (Word, PowerPoint, Outlook, Excel)
Our Preferred Skills/Experience…
  • Doctorate degree in atmospheric science, meteorology, geography, or Earth science, with a focus on physical climate science
  • Technical and team management experience for complex projects in a consulting environment
  • Knowledge of climate downscaling methodologies
  • Ability to analyze, interpret, and appropriately apply coastal climate data information (e.g., tide gauges and the analysis of sea level rise) and models (e.g. surge models)
  • Knowledge of recurrence interval calculation methodologies
  • Fluency in use of climate information across time scales
  • Use of climate information in developing countries
  • Understanding of the fundamentals of climate vulnerability and risk assessment and adaptation
  • Strong client relationships and contacts
Professional Skills…
  • Strong communications skills, including writing and presentations
  • Strong analytical, problem-solving, and decision making capabilities
  • Team player with the ability to work in a fast-paced environment
  • Ability to be flexible to handle multiple priorities
  • Well-developed time management skills
About ICF
ICF (NASDAQ:ICFI) is a global consulting and technology services provider with more than 5,000 professionals focused on making big things possible for our clients. We are business analysts, public policy experts, technologists, researchers, digital strategists, social scientists and creatives. Since 1969, government and commercial clients have worked with ICF to overcome their toughest challenges on issues that matter profoundly to their success. Come engage with us at
ICF offers an excellent benefits package, an award winning talent development program, and fosters a highly skilled, energized and empowered workforce.
ICF is an equal opportunity employer that values diversity at all levels. (EOE – Minorities/Females/Veterans/Individuals with Disabilities/Sexual Orientation and Gender Identity)

Primary Location: United States-District of Columbia-Washington

Wednesday, September 7, 2016

Phish observatory, George, WA

.@phish Nice OBS shoutout by Trey at around 3:32 to observe the night sky.

Monday, May 2, 2016

Space 2.0 takes off

(photo taken at Astro Digital)
After returning from last week’s Space 2.0 conference in Silicon Valley, a number of assumptions were confirmed.  Chief among the confirmations: (a) there is significant value to be generated from location-based information derived from space-based technologies, and (b) achieving that value will not be easy.  

Sensors, small-sats, machine learning, computer vision, rapid analytics, new markets, and more: these were just a few of the many topics discussed over the course of the three-day meeting, and there is plenty of room for optimism regarding Newspace.  But as many speakers noted, optimism needs to be viewed cautiously.  As a well-known game changer in the space industry (and electric cars, and solar energy, and transportation, and…) has stated, ‘Space is hard’.  Taking new concepts to market is difficult in any industry.  Add on the risks associated with launch failure, changing attitudes towards privacy, and the proliferation of new sources of data from both ground and space, and we can easily see how months can turn into years.  Difficulty notwithstanding, those of us in the space-based information industry have always known that there is something ‘there’.  However, monetizing what is there in a cost-effective commercial application has typically proven to be a significant challenge, with many more misses than hits.  With the avalanche of new data, coupled with the computational resources to analyze ever-increasing volumes, there does seem to be a new sense of optimism, and the future in my view is bright.  While many are in search of the killer app, practicality will trump flash, and solid technology platforms that address specific business challenges would appear to be better bets, from both a business adaptation perspective and a funding perspective, as many of the participants from the venture community can attest.  Further, as more back-end work moves to the cloud, workers are freed up to creatively engage with their data, spending more time thinking about how their structures and solutions fit the needs of commercial customers rather than managing and organizing data (the 80% of data science).  It follows that partnerships will be a key ingredient of success for both large and small players.  
Also, no emerging company, no matter how innovative its technology may be, will receive the necessary funding and support to mature without a solid commercial proposition.  The space sector is not one where outsized returns are realized over a short time frame. Patience is needed on the part of the funders, and this needs to be balanced with the needs and focus of a solid technology platform, complemented by a management team that can balance innovation with focus.

In my opinion, this is a great time for both established and fledgling companies to be participating in Newspace.  The opportunities are everywhere, but this does need to be balanced with focus.  My hope, and expectation, is that many of the views and aspirations expressed at Space2.0_2016 will be on their way to becoming a reality at Space2.0_2017.