<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "https://jats.nlm.nih.gov/nlm-dtd/publishing/3.0/journalpublishing3.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article" specific-use="SMUR" dtd-version="3.0" xml:lang="en">
<front>
<journal-meta>
<journal-id journal-id-type="publisher">OSD</journal-id>
<journal-title-group>
<journal-title>Ocean Science Discussions</journal-title>
<abbrev-journal-title abbrev-type="publisher">OSD</abbrev-journal-title>
<abbrev-journal-title abbrev-type="nlm-ta">Ocean Sci. Discuss.</abbrev-journal-title>
</journal-title-group>
<issn pub-type="epub">1812-0822</issn>
<publisher><publisher-name></publisher-name>
<publisher-loc>Göttingen, Germany</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.5194/os-2020-62</article-id>
<title-group>
<article-title>Towards operational phytoplankton recognition with automated high-throughput imaging and compact convolutional neural networks</article-title>
</title-group>
<contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Eerola</surname>
<given-names>Tuomas</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Kraft</surname>
<given-names>Kaisa</given-names>
<ext-link>https://orcid.org/0000-0001-6290-3887</ext-link>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Grönberg</surname>
<given-names>Osku</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Lensu</surname>
<given-names>Lasse</given-names>
<ext-link>https://orcid.org/0000-0002-7691-121X</ext-link>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Suikkanen</surname>
<given-names>Sanna</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Seppälä</surname>
<given-names>Jukka</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Tamminen</surname>
<given-names>Timo</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Kälviäinen</surname>
<given-names>Heikki</given-names>
<ext-link>https://orcid.org/0000-0002-0790-6847</ext-link>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Haario</surname>
<given-names>Heikki</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
</contrib-group><aff id="aff1">
<label>1</label>
<addr-line>Computer Vision and Pattern Recognition Laboratory, School of Engineering Science, Lappeenranta-Lahti University of Technology LUT, Finland</addr-line>
</aff>
<aff id="aff2">
<label>2</label>
<addr-line>Finnish Environment Institute, Marine Research Centre, Helsinki, Finland</addr-line>
</aff>
<pub-date pub-type="epub">
<day>08</day>
<month>07</month>
<year>2020</year>
</pub-date>
<volume>2020</volume>
<fpage>1</fpage>
<lpage>20</lpage>
<permissions>
<copyright-statement>Copyright: &#x000a9; 2020 Tuomas Eerola et al.</copyright-statement>
<copyright-year>2020</copyright-year>
<license license-type="open-access">
<license-p>This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, visit <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link></license-p>
</license>
</permissions>
<self-uri xlink:href="https://os.copernicus.org/preprints/os-2020-62/">This article is available from https://os.copernicus.org/preprints/os-2020-62/</self-uri>
<self-uri xlink:href="https://os.copernicus.org/preprints/os-2020-62/os-2020-62.pdf">The full text article is available as a PDF file from https://os.copernicus.org/preprints/os-2020-62/os-2020-62.pdf</self-uri>
<abstract>
<p>Plankton communities form the basis of aquatic ecosystems, and elucidating their role in increasingly important environmental issues is a persistent research question. The concealed dynamics of plankton communities reflect changes in environmental forcing, growth traits of competing species, and multiple food web interactions. Recent technological advances have made it possible to collect real-time big data, opening new horizons for testing, in planktonic systems, core hypotheses of community ecology, biodiversity research, and ecosystem functioning that were derived from macroscopic realms. Analyzing these data calls for computer vision and machine learning methods capable of producing interoperable data across platforms and systems. In this paper, we apply convolutional neural networks (CNNs) to classify a brackish-water phytoplankton community in the Baltic Sea. To solve the classification task, we use compact CNN architectures that require less computational capacity and can be trained quickly. This makes it possible to (1) test various modifications to the classification method, and (2) repeat each experiment multiple times with different training and test set combinations to obtain reliable results. We further analyze the effect of large class imbalance on CNN performance and test relevant data augmentation techniques to improve it. Finally, we address the practical implications of the classification performance for aquatic research by analyzing the confused classes and their effect on the reliability of the automatic plankton recognition system, to guide further development of plankton recognition research. Our results show that good classification accuracy can be obtained with relatively shallow architectures and a small amount of training data when effective data augmentation methods are used, even with a very unbalanced dataset.</p>
</abstract>
<counts><page-count count="20"/></counts>
<funding-group>
<award-group id="gs1">
<funding-source>Research Council of Finland</funding-source>
<award-id>321980</award-id>
</award-group>
<award-group id="gs2">
<funding-source>Research Council of Finland</funding-source>
<award-id>321991</award-id>
</award-group>
<award-group id="gs3">
<funding-source>H2020 Excellent Science</funding-source>
<award-id>JERICO-NEXT - Joint European Research Infrastructure network for Coastal Observatory – Novel European eXpertise for coastal observaTories (654410)</award-id>
</award-group>
</funding-group>
</article-meta>
</front>
<body/>
<back>
</back>
</article>