Three papers on Wikipedia

On www.citeulike.org (for those who don't know, it is an analogue of del.icio.us for scholarly bibliographic “bookmarks”), three interesting papers about Wikipedia have appeared in the library of user ChaTo:

Analyzing and Visualizing the Semantic Coverage of Wikipedia and Its Authors

Authors: Todd Holloway, Miran Bozicevic, Katy Börner

This paper presents a novel analysis and visualization of English Wikipedia data. Our specific interest is the analysis of basic statistics, the identification of the semantic structure and age of the categories in this free online encyclopedia, and the content coverage of its highly productive authors. The paper starts with an introduction of Wikipedia and a review of related work. We then introduce a suite of measures and approaches to analyze and map the semantic structure of Wikipedia. The results show that co-occurrences of categories within individual articles have a power-law distribution, and when mapped reveal the nicely clustered semantic structure of Wikipedia. The results also reveal the content coverage of the article’s authors, although the roles these authors play are as varied as the authors themselves. We conclude with a discussion of major results and planned future work.

Preferential attachment in the growth of social networks: the case of Wikipedia

Authors: A. Capocci, V.D.P. Servedio, F. Colaiori, L.S. Buriol, D. Donato, S. Leonardi, G. Caldarelli

We present an analysis of the statistical properties and growth of the free on-line encyclopedia Wikipedia. By describing topics by vertices and hyperlinks between them as edges, we can represent this encyclopedia as a directed graph. The topological properties of this graph are in close analogy with that of the World Wide Web, despite the very different growth mechanism. In particular we measure a scale–invariant distribution of the in– and out– degree and we are able to reproduce these features by means of a simple statistical model. As a major consequence, Wikipedia growth can be described by local rules such as the preferential attachment mechanism, though users can act globally on the network.
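As a rough illustration of the preferential attachment mechanism mentioned in the abstract, here is a minimal sketch of a generic textbook-style growth rule. It is not the model from the paper: the function name `preferential_attachment`, the single outgoing link per new node, and the "in-degree + 1" weighting are all assumptions made for the example.

```python
import random


def preferential_attachment(n_nodes, seed=0):
    """Grow a directed graph one node at a time.

    Each new node creates a single link to an existing node chosen with
    probability proportional to (in-degree + 1). This is an illustrative
    assumption, not the rule used in the paper.
    """
    rng = random.Random(seed)
    in_degree = [0]   # in_degree[v] = number of links pointing at node v
    edges = []        # list of (source, target) pairs
    # Node v appears in `targets` exactly in_degree[v] + 1 times, so a
    # uniform choice from this list is a degree-weighted choice of node.
    targets = [0]
    for new in range(1, n_nodes):
        target = rng.choice(targets)
        edges.append((new, target))
        in_degree[target] += 1
        in_degree.append(0)
        targets.append(target)  # one more entry for the node just linked to
        targets.append(new)     # the "+1" entry for the new node itself
    return edges, in_degree


edges, in_degree = preferential_attachment(1000)
```

Nodes that already attract many links are more likely to attract the next one, which is the purely local rule the abstract refers to. Plotting a histogram of `in_degree` on log-log axes is one way to look for the heavy tail.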

Measuring Wikipedia

Author: J. Voss

Posted 22/06/2006 No comments

New preprints on arXiv

Thresholds for virus spread on networks

Authors: M. Draief, A. Ganesh, L. Massoulie

We study how the spread of computer viruses, worms, and other self-replicating malware is affected by the logical topology of the network over which they propagate. We consider a model in which each host can be in one of 3 possible states - susceptible, infected or removed (cured, and no longer susceptible to infection). We characterise how the size of the population that eventually becomes infected depends on the network topology. Specifically, we show that if the ratio of cure to infection rates is larger than the spectral radius of the graph, and the initial infected population is small, then the final infected population is also small in a sense that can be made precise. Conversely, if this ratio is smaller than the spectral radius, then we show in some graph models of practical interest (including power law random graphs) that the final infected population is large. These results yield insights into what the critical parameters are in determining virus spread in networks.
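The threshold condition in the abstract can be checked numerically on small graphs. The sketch below is only an illustration under stated assumptions: the graph is undirected and connected, it is given as a dictionary of neighbour sets, and the helper names `spectral_radius` and `below_threshold` are made up for this example. The spectral radius is estimated by plain power iteration.

```python
import math


def spectral_radius(adj, iterations=1000, tol=1e-12):
    """Estimate the largest adjacency eigenvalue of an undirected graph.

    `adj` maps each node to the set of its neighbours. Power iteration is
    run on A + I instead of A so that it also converges on bipartite graphs.
    """
    nodes = list(adj)
    x = {v: 1.0 for v in nodes}
    estimate = 0.0
    for _ in range(iterations):
        # One multiplication by (A + I), followed by normalisation.
        y = {v: x[v] + sum(x[u] for u in adj[v]) for v in nodes}
        norm = math.sqrt(sum(val * val for val in y.values()))
        x = {v: val / norm for v, val in y.items()}
        # Rayleigh quotient x^T A x for the unit vector x.
        new_estimate = sum(x[v] * sum(x[u] for u in adj[v]) for v in nodes)
        converged = abs(new_estimate - estimate) < tol
        estimate = new_estimate
        if converged:
            break
    return estimate


def below_threshold(adj, infection_rate, cure_rate):
    """True if cure_rate / infection_rate exceeds the spectral radius."""
    return cure_rate / infection_rate > spectral_radius(adj)


# A star with 9 leaves: its spectral radius is sqrt(9) = 3.
star = {0: set(range(1, 10))}
star.update({leaf: {0} for leaf in range(1, 10)})
```

For the star above, a cure-to-infection ratio of 4 is above the threshold and a ratio of 2 is below it. The sketch only evaluates the condition; it does not simulate the epidemic itself.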

It is interesting that in this rather specialized problem an important role is played by the spectral gap of the graph over which the “infection” spreads. The connection between spectral properties and the geometry of graphs is discussed, at a physicist's level of rigor, in the preprint by Donetti, Neri, and Muñoz with the playful title “Optimal network topologies: Expanders, Cages, Ramanujan graphs, Entangled networks and all that”. Besides spectral analysis, the subject is also tied to combinatorial optimization, since the “isoperimetric constant”, which bounds the spectral gap from below, can in turn be estimated by solving a flow and cut problem. An interesting continuum reformulation of this discrete result appears in D. Grieser's preprint “The first eigenvalue of the Laplacian, isoperimetric constants, and the Max Flow Min Cut Theorem”.

Second, here is an interesting historical paper on the sources and component parts of Kolmogorov's formulation of probability theory:
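For reference, one standard form of the bound in question is the discrete Cheeger inequality. The statement below is written for a d-regular graph, which is an assumption made here for simplicity; the preprints mentioned above may use a different normalization. Here λ₂ is the second-largest adjacency eigenvalue and h(G) is the edge isoperimetric constant.

```latex
% Edge isoperimetric (Cheeger) constant of a graph G = (V, E), |V| = n:
h(G) = \min_{\substack{S \subset V \\ 0 < |S| \le n/2}} \frac{|\partial S|}{|S|},
\qquad
\partial S = \{\, \{u, v\} \in E : u \in S,\ v \notin S \,\}.

% Discrete Cheeger inequality for a d-regular graph with adjacency
% eigenvalues d = \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n:
\frac{d - \lambda_2}{2} \;\le\; h(G) \;\le\; \sqrt{2d\,(d - \lambda_2)}.

% Rearranging the right-hand inequality gives the lower bound
% on the spectral gap in terms of the isoperimetric constant:
d - \lambda_2 \;\ge\; \frac{h(G)^2}{2d}.
```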

The Sources of Kolmogorov’s Grundbegriffe

Authors: Glenn Shafer, Vladimir Vovk

Andrei Kolmogorov’s Grundbegriffe der Wahrscheinlichkeits-rechnung put probability’s modern mathematical formalism in place. It also provided a philosophy of probability–an explanation of how the formalism can be connected to the world of experience. In this article, we examine the sources of these two aspects of the Grundbegriffe–the work of the earlier scholars whose ideas Kolmogorov synthesized.