Software mining

Software mining is an application of knowledge discovery to software modernization that involves understanding existing software artifacts. The process is related to the concept of reverse engineering. The knowledge obtained from existing software is usually presented as models that can be queried when necessary. An entity-relationship model is a frequent format for representing this knowledge. The Object Management Group (OMG) developed the Knowledge Discovery Metamodel (KDM) specification, which defines an ontology for software assets and their relationships for the purpose of performing knowledge discovery on existing code.
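As a rough illustration of a queryable entity-relationship model, the sketch below stores mined facts as (source, kind, target) relations and filters them on demand. The Relation type, the query helper, and every fact in FACTS are hypothetical names invented for this example; this is not the KDM format, only a minimal stand-in for the idea.

```python
from typing import NamedTuple, Optional


class Relation(NamedTuple):
    """One mined fact: a source entity, a relationship kind, a target entity."""
    source: str
    kind: str
    target: str


# Hypothetical facts, as they might be extracted from a small billing system.
FACTS = [
    Relation("billing.py", "defines", "compute_invoice"),
    Relation("compute_invoice", "calls", "lookup_rate"),
    Relation("compute_invoice", "reads", "invoice.total"),
    Relation("lookup_rate", "reads", "rate.value"),
]


def query(facts, source: Optional[str] = None,
          kind: Optional[str] = None,
          target: Optional[str] = None):
    """Return the facts that match every field that was supplied."""
    return [
        fact for fact in facts
        if (source is None or fact.source == source)
        and (kind is None or fact.kind == kind)
        and (target is None or fact.target == target)
    ]


# Which entities read data, and what does compute_invoice call?
readers = [fact.source for fact in query(FACTS, kind="reads")]
callees = [fact.target for fact in query(FACTS, source="compute_invoice", kind="calls")]
```

Keeping the facts as plain relations means new questions can be asked later without mining the code again.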

Software mining and data mining

Software mining is closely related to data mining, since existing software artifacts contain enormous business value that is key to the evolution of software systems. Knowledge discovery from software systems addresses structure, behavior, and the data processed by the software system. Instead of mining individual data sets, software mining focuses on metadata, such as database schemas. The OMG Knowledge Discovery Metamodel provides an integrated representation for capturing application metadata as part of a holistic metamodel of the existing system. Another OMG specification, the Common Warehouse Metamodel, focuses entirely on mining enterprise metadata.
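The focus on metadata rather than data can be sketched with a database schema. The example below, which assumes an SQLite database and uses invented table names (customer, invoice), reads only the catalog: table names, column types, and foreign-key references. No rows are inspected. The mine_schema helper is hypothetical and is not part of KDM or the Common Warehouse Metamodel.

```python
import sqlite3


def mine_schema(conn):
    """Collect table, column, and foreign-key metadata from an SQLite database."""
    schema = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = [
            (row[1], row[2])
            for row in conn.execute(f'PRAGMA table_info("{table}")')
        ]
        # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
        references = sorted(
            {row[2] for row in conn.execute(f'PRAGMA foreign_key_list("{table}")')}
        )
        schema[table] = {"columns": columns, "references": references}
    return schema


# An in-memory database with a hypothetical two-table schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE invoice (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id),
        total REAL
    );
    """
)
schema = mine_schema(conn)
conn.close()
```

The resulting dictionary describes the data level of the system (columns and their relationships) and could feed a larger model of the application.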

Text mining software tools

Text mining software tools make it easier to handle text documents for data analysis, including automatic model generation, document classification, document clustering, document visualization, handling of Web documents, and Web crawling.

Levels of software mining

Knowledge discovery in software is related to the concept of reverse engineering. Software mining addresses structure, behavior, and the data processed by the software system.

Mining software systems may happen at various levels:

  • program level (individual statements and variables)
  • design pattern level
  • call graph level (individual procedures and their relationships)
  • architectural level (subsystems and their interfaces)
  • data level (individual columns and attributes of data stores)
  • application level (key data items and their flow through the applications)
  • business level (domain concepts, business rules and their implementation in code)
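As one concrete case, mining at the call graph level can be sketched with Python's standard ast module. The sample source and the mine_call_graph helper below are hypothetical, and the sketch makes a simplifying assumption: it records only calls made through a plain name (such as load(path)) and ignores method calls (such as text.split()).

```python
import ast

# Hypothetical program to be mined; any Python source text would do.
SAMPLE_SOURCE = '''
def load(path):
    return open(path).read()

def parse(text):
    return text.split()

def main(path):
    return parse(load(path))
'''


def mine_call_graph(source):
    """Map each top-level function to the sorted plain names it calls."""
    tree = ast.parse(source)
    graph = {}
    for node in tree.body:
        if not isinstance(node, ast.FunctionDef):
            continue
        callees = set()
        for inner in ast.walk(node):
            # Keep calls like f(x); skip attribute calls like obj.f(x).
            if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                callees.add(inner.func.id)
        graph[node.name] = sorted(callees)
    return graph


call_graph = mine_call_graph(SAMPLE_SOURCE)
```

The same traversal could be narrowed to individual statements and variables (program level) or aggregated across modules (architectural level), depending on the question being asked.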

Forms of representing the results of software mining


See also

  • Mining Software Repositories

References

    This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.