Commons talk:WMF support for Commons/Commons community calls

This is the talk page for discussing improvements to Commons:WMF support for Commons/Commons community calls.

Priorities from the perspective of a frequent user and re-user (inside and outside Wikimedia projects)

Posting here just in case I miss tomorrow's call. I am very grateful for this opportunity, thank you for listening and considering <3 !

My perspective: I very frequently edit Wikimedia Commons, with the focus of describing the media there as accurately and reliably as possible, and making the media there usable and re-usable by the world (not just Wikimedia projects), in full agreement with the Wikimedia movement strategy.

Professionally I also currently lead a project by a government agency which frequently re-uses media from Wikimedia Commons (probably often media which is not used in Wikipedia at all). You can see some of the usage here. Besides this visible re-use, we also rely on search and querying of Wikimedia Commons and Wikidata to find more media and data, which is harder to track down. As project manager I can say our usage and data retrieval runs into the tens to hundreds of thousands of Wikidata items and Commons files.

I have worked on media databases (broadly speaking) professionally since the early 2000s. My native language is Dutch and I am very aware that the majority of the world doesn't speak a word of English. We have the tools in Wikimedia projects to serve this majority of the world if we decide to leverage them.

High-level wishes from these perspectives:

  • In terms of content organization, multilingual discoverability and ease of re-use, structured data is vastly superior. A part of the Wikimedia Commons community is very attached to Wikitext and categories, and I heard that they matter for discoverability too; therefore I still use them, mainly as duplicate work on top of adding structured data. I would be able to use my scarce volunteer time more efficiently if this were not needed. For re-using and searching, structured data is the way to go. Commons should be a structured database just like any other contemporary digital asset management system.
  • Commons is a knowledge platform, not a stock images website. If I want to find a free picture of a dog or a rainbow, I will use a generic search engine. The unique strength of Wikimedia Commons is that we describe and contextualize very specific things (a specific church at a certain point in time, a specific occurrence of an animal or plant in a specific location...). Generic search engines can't help searching for such specific things at specific times and spaces, but we can build (and partly already have) the unique and very helpful infrastructure to achieve that. We should further develop search and browsing for discovery of such specificity. For discovery of media related to general topics, IMO it's better to e.g. work with general-purpose search engines, perhaps focusing on mission-aligned ones (e.g. DuckDuckGo?), to make our general-scope media more generally discoverable there.
  • Generally make structured data more visible to contributors so that there is more incentive to improve it.
  • Design updates to SDC should encourage editors to edit with precision and accuracy (sourced, correct, not generic but specific).

More specific wishes and requests I'm currently thinking of:

  • Remove authentication from WCQS so that Wikimedians, and cultural and other knowledge organizations around the world can perform federated and shareable Wikimedia Commons queries.
  • Improve MediaSearch so that it shows (structured) metadata of each file by default (not needing a click).
  • Add faceted search to MediaSearch.
  • Persistent faceted search results can become new-style galleries. It would be great to be able to have persistent URIs for specific faceted search results, multilingually ("Korenmolen de Distilleerketel in de 19de eeuw")
  • Show structured data on file pages by default and more prominent than Wikitext (not in a separate tab anymore)
  • In order to be able to re-use gadgets and scripts from Wikidata, and to provide a unified experience, make sure SDC has the generic Wikibase/Wikidata design (i.e. revert the decision to have Commons-specific UI for SDC).

Thanks! Spinster (talk) 09:39, 20 November 2024 (UTC)[reply]

As someone who works on Wikidata scripts/gadgets a lot, the biggest problem for those (by far) is the lack of Javascript hooks. I can adapt scripts to support different HTML structures, but they won't work if they don't run at the right time.
Also, links to some relevant tickets:
  • phab:T327076 - UI for structured data on Commons should have the same Javascript hooks as Wikidata
  • phab:T341781 - Show structured data by default
  • phab:T297995 - Remove authentication from Wikimedia Commons Query Services (WCQS)
  • phab:T337106 - Faceted, structured data-based MediaSearch on Wikimedia Commons
- Nikki (talk) 16:13, 20 November 2024 (UTC)[reply]
"In terms of content organization, multilingual discoverability and ease of re-use, structured data is vastly superior." Strongly disagree. It's basically redundant to categories and just duplicates the work. Most files do not have structured data, and those that do lack most major subjects and have fewer things set than the categories. Most of the SD that has been set was derived from the categories. It's wishful thinking, and SD is a resource sink without much need for it when it comes to subjects depicted. Moreover, categories can also be multilingual – it's just one of many cases where people think SD is needed or better when it's not. See Add machine translated category titles on WMC.
"Improve MediaSearch so that it shows (structured) metadata of each file by default" Also strongly oppose – instead make it show the categories, which unlike SD are well-maintained, usually fairly complete and not polluted with unrelated or vandal depicts data.
"make structured data more visible to contributors so that there is more incentive to improve it" just wastes precious scarce volunteer time duplicating work that has already been done via file categories.
"For discovery of media related to general topics, IMO it's better to e.g. work with general-purpose search engines, perhaps focusing on mission-aligned ones (e.g. DuckDuckGo?), to make our general-scope media more generally discoverable there." People also search for relatively niche things with Web search engines (e.g. a specific river from space at sunlight) and the problem is that WMC is not well indexed there. Videos are not showing in DuckDuckGo Videos at all, for example. See Do something about Google & DuckDuckGo search not indexing media files and categories on Commons.
Please accept the reality of structured data and categories. Prototyperspective (talk) 19:22, 20 November 2024 (UTC)[reply]
The category system is broken in a lot of ways. It doesn't scale well to the size of Commons, and is causing stability issues. Tiny intersection categories ("Red apples with green spots sitting on blue plates in November 2024") are common but make actually using the category system to find every picture of a red apple difficult. All of this and more is solved by structured data, but migrating all of the existing category-based data to structured data absolutely is a challenge. The tools to work with structured data are often barely functional, and WCQS has been an afterthought since it was introduced. But that doesn't mean we should look backward instead of forward. AntiCompositeNumber (talk) 15:46, 21 November 2024 (UTC)[reply]
  1. It's not broken at all.
  2. For scaling you seem to be referring to phab:T343131 which can be addressed in various ways such as maybe better caching or removing redundant meta-categories (or moving these to SD since they are not about the content).
  3. "[overspecific intersection categories] make actually using the category system to find every picture of a red apple difficult." (1) Not an issue of categories. (2) Not addressed with structured data. (3) Addressed with the Deepcat gadget, which would be greatly improved if deepcategory search operator issues like phab:T376440 were fixed, and which could be improved upon (e.g. specify depth or exclude certain subcats of Red apples like "Red apples in fiction"), and with this highly supported wish.
  4. Those overspecific categories are, if anything, a problem; often they get upmerged, and if not you could propose that. But there should also be a category the user would navigate to that contains more of these files instead of many deep overspecific cats. Moreover, many of these by-date categories should be made redundant by enabling users to sort, search and/or filter (also see phab:T329961 & phab:T329961) by content in the {{Information}} template, like the date= field. This is quite overdue: there is so much useful metadata in there that it should be searchable / part of filters.
  5. "All of this and more is solved by structured data" That is denying reality and wishful thinking. None of these things have been solved, or solved to any notable degree.
  6. "that doesn't mean we should look backward instead of forward" Just because something is new doesn't make it better. When it comes to subjects depicted, forward is categories; putting one's head in the sand and arguing from what one ideologically wishes were true is structured data.
Prototyperspective (talk) 16:25, 21 November 2024 (UTC)[reply]
... everyone breathe :)
  • an image gallery including subcats is a great bandaid.
  • If we're redesigning things to make more sense: combination categories are a bit of a misuse of the theoretical concept of cats. "X in fiction" should be in categories "X" + "in fiction". Then <adjective> <adjective> <adjective> <noun> <in context> <in context> would be in six atomic categories, with a large number of possible combination categories. Then we need indexes and views that allow seeing all of the "red, decaying, food, on flatware" which will show red apples with green mold spots on blue plates.
--SJ+ 14:40, 24 November 2024 (UTC)[reply]
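The "atomic categories plus indexes and views" idea above can be sketched as a small inverted index. This is only an illustration in Python: the class name `TagIndex`, its methods and the file names are hypothetical, and nothing here reflects how MediaWiki categories or structured data are actually stored.

```python
from collections import defaultdict


class TagIndex:
    """Toy inverted index: each atomic category maps to the set of files in it."""

    def __init__(self):
        self._files_by_tag = defaultdict(set)

    def add(self, filename, tags):
        # Register the file under every atomic category it belongs to.
        for tag in tags:
            self._files_by_tag[tag].add(filename)

    def find(self, *tags):
        # A "combination category" is just the intersection of atomic ones.
        if not tags:
            return set()
        sets = [self._files_by_tag.get(tag, set()) for tag in tags]
        return set.intersection(*sets)


index = TagIndex()
index.add("Mouldy_apple_on_plate.jpg", {"red", "decaying", "food", "on flatware"})
index.add("Fresh_apple.jpg", {"red", "food"})
index.add("Empty_plate.jpg", {"on flatware"})

# Every red food item, regardless of how specific its other categories are.
print(sorted(index.find("red", "food")))
# Only the file that sits in all four atomic categories.
print(sorted(index.find("red", "decaying", "food", "on flatware")))
```

With this shape, "every picture of a red apple" is a query over a few atomic categories rather than a walk through many narrow intersection categories.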

Perennial needs

Commons:Requests for comment/Technical needs survey. RoyZuo (talk) 11:34, 20 November 2024 (UTC)[reply]

@RoyZuo Thanks, we already discussed internally the result of this survey, and we tried to include as much as possible its findings into our roadmap. Sannita (WMF) (talk) 10:26, 25 November 2024 (UTC)[reply]
And also Commons-related Community Wishlist proposals. I hope both are being discussed in the community calls and looked into instead of the CC kind of sidelining/duplicating these – if I was able to attend I would only bring up these two resources and various specific already-existing proposals in them and ask for increasing technical development as described here. Prototyperspective (talk) 15:32, 10 December 2024 (UTC)[reply]

summary of calls

how did the two sessions yesterday go? Arlo James Barnes 20:17, 22 November 2024 (UTC)[reply]

@Arlo Barnes Thanks for the question. The calls went well, we will publish the notes in the next days. Please have a bit of patience, because we need to give them a bit of structure. Sannita (WMF) (talk) 18:02, 23 November 2024 (UTC)[reply]
Do these meetings ever have etherpads or collective notes that the attendees can contribute to? That makes some community meetings easier to follow --SJ+ 14:40, 24 November 2024 (UTC)[reply]
@Sj We collected feedback on an internal document, but I'll ask if we can move to Etherpad for the next calls. Sannita (WMF) (talk) 10:25, 25 November 2024 (UTC)[reply]
Did I miss them or are the notes from the session still not published? GPSLeo (talk) 15:40, 8 December 2024 (UTC)[reply]
@GPSLeo Still not published, sorry we're so behind on this, we're trying to summarise them (also for internal use). Sannita (WMF) (talk) 15:41, 8 December 2024 (UTC)[reply]
@Sannita (WMF) I suspect other editors may have mentioned this in the past, but it is rather ironic that Wikimedia is an information platform used by tens of thousands of community members every day to discuss and resolve issues, yet the Foundation so often seems unwilling or unable to use this platform. The foundation wouldn't be struggling to summarize and publish the discussions, if those discussions had simply happened on-wiki.
The fact that those discussions aren't already accessible is a transparency problem. The fact that the discussions are to be "summarized" needlessly raises further transparency and trust issues. I want to clarify that "trust" issue - I'm confident your people will do their best to summarize the discussion fairly and accurately. The issue is that many problems are rooted in miscommunication and misunderstandings. Some wiki-cultural or wiki-contextual subtleties have been notoriously difficult to communicate across the community-foundation interface. Non-transparent processes attempting to "summarize" discussions with the community are only liable to escalate any miscommunications or misinterpretations.
For some editors, having to sign up for a live chat at appointed time on some arbitrary other-platform are burdensome or prohibitive constraints on who is permitted to participate. Some contributors have real life commitments and can't attend at the assigned time. Some contributors have less predictable lives and can't commit to a scheduled time. Some contributors find live chat too stressful or too constraining. Some of our contributors are attracted to wiki work exactly because the wiki allows them to contribute and to respond in whatever time and manner makes them most comfortable. The Foundation has noted many times a substantial percentage of contributors are de facto excluded, for whatever reason, if people are required to go off-wiki to participate. Alsee (talk) 11:23, 16 December 2024 (UTC)[reply]
@Alsee I know, and I take full blame for it. In my defense, I had also other projects to follow and to close before the end of the year, while also organising the December call. I will post the summary of the calls during the week, and let you know about it. Sannita (WMF) (talk) 14:03, 16 December 2024 (UTC)[reply]

@Arlo Barnes, Sj, GPSLeo, and Alsee: The summary of the November conversation is now available on a subpage. We're working on the December call's summary, and we expect to publish it in early January. Sorry for keeping you waiting, we hope to speed up the process for the next calls. Sannita (WMF) (talk) 14:34, 18 December 2024 (UTC)[reply]

Thank you, very helpful. :) Honoring the comment by @Alsee, I would appreciate any upgrade to the "real-time meeting workflow" that leads to automatic publication of summary transcripts, even if they are updated later with a more accurate or more useful one. In addition to being more inclusive, asynchronous discussions mediated by text also seem easier to search through and less expensive to organize, translate, multitask around, and sustain over long periods of time. So maybe we could aim for a certain ratio of asynch vs real-time Q&A: a few rounds of asynch for each real-time call... --SJ+ 14:55, 6 January 2025 (UTC)[reply]

My comments based on Sandra's and Nikki's comments - Jane023

Quick reorganisation of Sandra’s and Nikki’s comments to be able to refer to these issues by number:

1) Remove authentication from WCQS so that Wikimedians, and cultural and other knowledge organizations around the world can perform federated and shareable Wikimedia Commons queries. phab:T297995 - Remove authentication from Wikimedia Commons Query Services (WCQS)

2) Improve MediaSearch so that it shows (structured) metadata of each file by default (not needing a click).

3) Add faceted search to MediaSearch. phab:T337106 - Faceted, structured data-based MediaSearch on Wikimedia Commons

4) Persistent faceted search results can become new-style galleries. It would be great to be able to have persistent URIs for specific faceted search results, multilingually ("Korenmolen de Distilleerketel in de 19de eeuw"). This is a popular windmill today that was a ruin in the early 20th century - see nl:De Distilleerketel

5) Show structured data on file pages by default and more prominent than Wikitext (not in a separate tab anymore) phab:T341781 - Show structured data by default

6) In order to be able to re-use gadgets and scripts from Wikidata, and to provide a unified experience, make sure SDC has the generic Wikibase/Wikidata design (i.e. revert the decision to have Commons-specific UI for SDC). phab:T327076 - UI for structured data on Commons should have the same Javascript hooks as Wikidata

On categories: I am going to skip the category discussion because, though I love Commons (and Wikipedia) categories and use HotCat and Cat-a-lot quite a bit on Commons categories for heritage sites and artist categories, I have given up on the “category or item” discussion in favour of both when and if possible. On 2) I feel that Commons categories are much less efficient for search than structured data, but because of all the restrictions on practical use of WCQS (it’s so well hidden!) I prefer Wikidata search. My main issue with categories these days is that when I go to track down painting files in some language Wikipedia I don’t speak or read, I am shocked that when I click on the file it doesn’t take me to my “normal” Commons UI, but takes me by default instead to some UI that doesn’t give me any Commons categories at all. My main comment on 1) is that this authentication feature is the reason I don’t use WCQS at all. My main comment on 5) is that I occasionally get confused when I update the Wikidata item in a Commons file but the file is still showing the data from the old Q number and I have to go in and change the Q number there too. This hasn’t happened recently so no idea if it has been fixed. For point 6) I agree, but I would wish to keep my default setting across all Wikimedia projects based on my default language version on that project (currently it seems I get the “not logged in” version as soon as I leave one and enter another).

On copyright files: All of that being said, one thing I really like is the effort to improve multi-lingual copyright labels outside of the complicated templating that we have had since 2010. As a paintings enthusiast with a fondness for Dutch 17th-century art, I am happiest with high resolution images of such paintings and always eager to see the best version in use. I am a big fan of detail images of paintings and recognise the challenges when all we have are details and are missing an image of the whole painting. Occasionally I stray outside of my safe “PD-old-expired-100” and I find it very confusing at times to see that we are using any image but most weirdly, an image of the signature for a copyrighted painting with no indication of the reason.

On "Commons file gaps": As a member of a gendergap workgroup I am also always surprised by the lack of any gender categories (though with various gender discussions these are as problematic as ethnic or melanin-toned categories). In the case of missing paintings, I have looked for ways to show the gap, and of course this is currently only possible on Wikidata. For popular modern artists there is currently no way to show a commons gallery of paintings in a catalog except to show a numbered series of File:Noimage.svg. Instead of deleting these regularly when they get uploaded by misinformed or unsuspecting Commonists, it would be nice if copyrighted images just default automatically to the “no image” option based on the Wikidata item information, which can be passed through to Commons.

On "modern art on Commons": I do find it logical to look for modern art on Commons, even if we insiders know it’s not there. Most people will look for a painter or sculptor name without considering copyright at all. With a true “global sign-in” to give me my customised UI, I could possibly use some structured data flag to a file held in some non-English Wikipedia, enable auto-delete for copyright files based on death date of the creator (though possibly trumped by Freedom of Panorama) all using some “No image” structured data artefact so that if the artist gets reattributed or his/her death date passes the 70-year cutoff, the “undelete” would be semi-automatic and if there was no previous upload, then maybe an auto-upload link can be added to the artefact.

On "ghost uploads": I think the precision of our artist death dates is one reason we don’t have more modern art on Commons. As Commonists come and go, their “ghost uploads” for modern art slowly snowball into huge black holes, especially as exhibitions that those Commonists attended join the other “great exhibitions we all forgot ever happened”. Jane023 (talk) 10:15, 27 November 2024 (UTC)[reply]
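The death-date cutoff mentioned in the comment above could be checked along the following lines. This is a minimal sketch in Python, assuming a life-plus-70-years term that runs to the end of the calendar year; the function name and parameters are hypothetical, and actual copyright status also depends on jurisdiction, Freedom of Panorama and other factors that are not modelled here.

```python
from datetime import date


def past_death_date_cutoff(death_year, today, term_years=70):
    """Return True once the creator's death year plus the term has fully elapsed.

    Assumption: the term runs to 31 December of (death_year + term_years),
    so the cutoff is passed from 1 January of the following year.
    """
    return today >= date(death_year + term_years + 1, 1, 1)


# A creator who died in 1954 passes the 70-year cutoff on 1 January 2025.
print(past_death_date_cutoff(1954, date(2024, 12, 31)))
print(past_death_date_cutoff(1954, date(2025, 1, 1)))
```

A "No image" artefact as described could re-run such a check whenever the creator's death date on Wikidata changes, which is what would make the "undelete" semi-automatic.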

@Jane023 Thanks, much appreciated! Sannita (WMF) (talk) 12:30, 27 November 2024 (UTC)[reply]
As one of the easiest requests to fulfil: what are the obstacles to temporarily addressing #1 to see the impact on usage and overhead? That's more a policy than a technical question. And I haven't seen any specific claims about what the negative outcomes might be. --SJ+ 14:57, 6 January 2025 (UTC)[reply]
@Sj Regarding WCQS, we are still addressing internally the issue, and I feel it might be a topic for the upcoming conversation regarding tools. The discussion is also going on on Phabricator, and I'm monitoring it closely. Sannita (WMF) (talk) 14:03, 7 January 2025 (UTC)[reply]
[edit]

Where are they?   — 🇺🇦Jeff G. please ping or talk to me🇺🇦 15:49, 12 December 2024 (UTC)[reply]

At the event on meta m:Event:Commons community discussion - 12 December 2024 16:00 UTC. GPSLeo (talk) 16:04, 12 December 2024 (UTC)[reply]
@GPSLeo: I see now under "More details", thanks. Also in my email after registration.   — 🇺🇦Jeff G. please ping or talk to me🇺🇦 17:08, 12 December 2024 (UTC)[reply]
@Jeff G. Sorry I missed the comment. For the next calls I'll be sure to make them more visible. Sannita (WMF) (talk) 17:08, 12 December 2024 (UTC)[reply]
@Sannita (WMF): Thanks! Not in the VP, though.   — 🇺🇦Jeff G. please ping or talk to me🇺🇦 17:20, 12 December 2024 (UTC)[reply]
@Jeff G. I was thinking the event page :) Sannita (WMF) (talk) 18:06, 12 December 2024 (UTC)[reply]

Thoughts on tool investment priority questions

As I am reluctant to join a call that is using proprietary technology (there's a priority for you), I'll answer the two questions that were shared on Telegram here.

  1. I think it makes more sense that WMF is working on core tools rather than supporting community developed tools. That being said, I think a lot of the tools created by the community should be core functionality. By that, I mean that it is not just a maintainer for the existing tool that is needed, but that the idea is brought into core and made fit natively in the existing workflow/ecosystem. Sometimes it is completely missing though, as the possibility to record and upload video in a free format should be a core functionality, but there is no clear "core" for that to add it to. In that case, helping out on the current Commons app may be the right thing (or perhaps it would be an even cleaner separation to have a separate app for it). Another possible aspect of this question is that WMF can help highlight which tools are in most dire need of active maintainers, and in general make efforts to help ease onboarding of new ones.
  2. I would love to see a video recording/upload tool, as that seems to be something that is hard for the community to build.

Ainali (talk) 15:12, 6 January 2025 (UTC)[reply]

  • I would like the Foundation to invest in supporting any much-used tool, no matter who has created it. First of all I think of Cat-a-lot, a much-used tool (at least in Commons, but perhaps also in EN-WP), with which there were problems last year and it took quite a while to solve most of them, because the creator was not available anymore and WMF was reluctant to help because it was not a tool made by them. I think this is one of the tools created by the community that should be core functionality. --JopkeB (talk) 15:36, 6 January 2025 (UTC)[reply]
Very much looking forward to the new 2025 calls. The ones I attended so far have been very well organized and moderated and it was really interesting to hear everyone's perspective.
  1. I think that it's preferable for WMF / Wikimedia entities to develop and maintain (often newly built) functionalities rather than taking over maintenance of community tools.
    1. It should be a long-term engagement with maintenance planned for at least the next decade.
  2. In my view/experience, metrics, batch contribution and batch import functionalities are crucial to build and maintain. Not only for the GLAM use case, but in general. Batch contribution to and batch upload of files is important not just for Wikimedians but is very valuable for anyone working with a MediaWiki wiki that includes large(r) amounts of media files. Metrics are interesting not just for partners, but to display the impact of the contributions of any Wikimedian.
  3. As I mentioned above and expressed in earlier calls, structured data should be prioritized, not categories and wikitext.
Spinster (talk) 15:45, 6 January 2025 (UTC)[reply]

I may or may not be able to make the call, but would like to suggest that for a lot of community-maintained tools, what WMF could best provide is program management and coordination of volunteers. We don't necessarily lack volunteer developers willing to maintain these tools, but the PM side of the process (keeping track of whether there are maintainers signed up for each tool; raising the flag when there is priority work; getting word out when there are changes coming that are likely to break existing tools and tracking whether someone has taken responsibility for checking each tool to make sure it is ready to deal with the change; creating a path for people to get involved in this work; etc.) is an extremely difficult task to accomplish on an unpaid volunteer basis. I suspect that if this coordination were better done, we would have a lot more volunteers to do this. - Jmabel ! talk 18:49, 6 January 2025 (UTC)[reply]

I support this suggestion of Jmabel. It would already be very helpful to know who (or where) to turn to for questions about a tool, especially when there is no answer on the talk page of a tool (including templates). JopkeB (talk) 06:53, 7 January 2025 (UTC)[reply]
One area where there could be more collaboration is with initiatives like d:Wikidata:Wiki Mentor Africa and m:Global Majority Wikimedia Technology Priorities. Wiki Mentor Africa specifically is focusing on technical skill development, and it could be useful for them to find suitable tools for long term development. I think that they also have coordinators, but in the tooling context, I think that the biggest bottleneck is the number of technically skilled mentors (i.e., persons who already know how the tools, Wikimedia infrastructure, etc. work under the hood) who can answer questions. --Zache (talk) 08:40, 7 January 2025 (UTC)[reply]

I add my two cents on a WMF-maintained video upload and conversion feature (either as a core mediawiki feature or a dedicated WMF infra/tool). I (reluctantly) adopted video2commons last year, only because I really needed it for my own project and no one stepped up when the tool was broken for several months. I managed to patch it up and it's more stable now, but still there are plenty of bugs to solve, new features to address. I lack both time and skills to maintain it properly, so I'd really love to see the WMF invest resources in this topic. vip (talk) 21:12, 8 January 2025 (UTC)[reply]

I do think part of the problem is we have a "you touch it you bought it" kind of policy. The moment you help at all, you're suddenly committed to maintaining it for the next decade. Bawolff (talk) 17:33, 1 February 2025 (UTC)[reply]
It is also a pain supporting others' tools: if the individual in question does not respond, your only option is to fork the tool. We at Wiki Project Med have been working to improve the Commons:Croptool so that it supports svgs and pdfs among other formats. Doc James (talk · contribs · email) 01:52, 4 February 2025 (UTC)[reply]

Thoughts on the discussion about new media and new contributors from a Latin American perspective

Some members of Latin American user groups have gathered to discuss the two questions raised in the December 12 call, which not all of us were able to attend.

Question #1: There are periodic requests from the community for more support to contribute and edit audio and video files. At the same time, the community seems to be struggling under the current weight of uncategorized images and various patrolling backlogs. Does the community have the capacity to respond to substantially increased uploads of this media?

First of all, we want to highlight the importance that the audios and videos on Wikimedia Commons have for our communities. Much of the cultural heritage that we need to document and share is oral, musical and immaterial. Our cultures are made up of dances, languages and rituals that can hardly be documented through text and still images. Here are some examples of audio and video-based initiatives we have worked on: Category:Registro Sonoro en Lengua Quechua, Category:Laboratorio de registros sonoros 2024, Audiovisual initiative Miradas desde Bolivia

In addition, we believe that not encouraging the contribution of multimedia content would mean being left out of the consumption of content on the Internet. There is much to investigate about multimedia consumption habits inside and outside Wikimedia Commons and we think it is possible that audio and audiovisuals are more necessary as educational resources for some regions of the world than for others, just as they are for some age groups.

Therefore, we support requests for developments that help to contribute and edit audio and video files. We believe that these developments are necessary for knowledge equity and epistemic justice.

Question #2: Wikimedia Foundation has been working on improving UploadWizard, based on research done with volunteers who wanted a better interface to limit bad uploads. At the same time, other users have pushed in the opposite direction by asking for technology support that makes it easier for people to upload media. What is the right level of friction for new content uploads? Should we prioritize support for easier contribution or continue to introduce friction that reduces moderator burden?

Our answer to this question is: let's support easier contribution. If we work to share the sum of all knowledge, we need the collaboration of many, many people. We cannot fail to mention that we work in contexts where digital literacy is not the greatest, and where people are dedicating time that they can hardly spare to upload a photograph. Those of us who organize activities with new contributors consider that it is desirable that they have as friendly an experience as possible when they take their first steps as contributors in the movement. We think that user groups and chapters should have the possibility to choose to "absorb" the more technical work involved in uploading files to accompany these people at the beginning of their Wikimedia journey. The UploadWizard should be flexible to capture all the information that an expert user can contribute, and at the same time not impede (or be hostile to) the contribution of a newcomer.

Conclusions on both questions: We greatly value the work of the community in patrolling and categorizing the archives, and we are not unaware that the developments we are calling for are likely to increase this workload. But we think this should not necessarily be seen as a "tension" between two parties: it is more a division of labor where each party contributes what it can. Newbies need expert administrators and users so that their contributions are not lost, just as we all need those contributions from newbies to contribute to the equity of knowledge to which we aspire as a movement. In this sense, we think that, as well as developing features that make file uploads easier, it is necessary to:

  • develop tools that make moderators' tasks easier
  • support the creation of multilingual user-friendly educational resources to help newcomers learn more quickly about licenses, categories and other technical aspects of Commons
  • support the development and sustainability of tools that "gamify" or simplify the tasks of categorization, description and aggregation of structured data, so that newcomers can start contributing to these tasks sooner
  • for our part, we intend to think about activities for our campaigns and contests that help to organize the content already present in Commons, taking into account the limitations that we may encounter.

We cannot fail to mention, even if it is not the focus of the discussion, that for the Global Majority it is urgent to substantially improve the mobile applications for all of Wikimedia Commons, Wikipedia, and Wikidata.

Finally, we make ourselves available to collaborate in the realization of a regional Community Call for Latin America, if there is a possibility to make that happen. We believe that having instances that focus on the vision of users from different regions of the world can be key to the success of this process.

We thank you for listening, and we remain at your disposal for further discussions. See you at the Community Call tomorrow! Nat (WDU) (talk) 16:24, 14 January 2025 (UTC)

The eternal false dichotomy

In every call we are asked to choose between two completely valid proposals, so that we decide how work should be prioritized. This is a false dichotomy, an illusion of choice pretending that we can do one or the other, this or that, but never both. This process, like the wishlist one, creates scarcity: because no one will ever be satisfied, we end up with sub-optimal solutions instead of global improvements. We don't need scarcity, we need abundance. And we have enough money to do that. Theklan (talk) 12:02, 27 January 2025 (UTC)

Purpose of this page

At the top of this page:

This is not a forum for general discussion of the page’s subject.

In today's announcement on the Village Pump:

If you cannot attend the meeting, you are invited to express your point of view at any time you want on the Commons community calls talk page.

Which is it? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:26, 27 January 2025 (UTC)

It's definitely a mistake of the template. This is a place to express opinions on the topics of the calls. I'll fix that. Sannita (WMF) (talk) 14:19, 27 January 2025 (UTC)

Comments on impact and funding model

Answering this call.

If we truly want Commons to contribute to the sum of all knowledge, the WMF should focus on making Commons into a modern, easy to use media repository for everyone (not just the sister projects). I'm fairly certain this is the same sentiment you will hear from everyone in the Commons community. We contribute to the project because we believe in it in its own right, not because we just want to support Wikipedia. In order for Commons to flourish, it needs several things:

  • Better tools for reviewing and flagging media submissions, including audio and video.
  • Better tools for managing and creating structured data (including better integration with UploadWizard).
  • A better search interface. Special:MediaSearch is a good start, but it has a long way to go.

One thing Commons does not need is machine-assisted tagging. I honestly can't believe this is still being proposed, and it makes me question whether the WMF actually takes any community feedback seriously or is just going through the motions. Nosferattus (talk) 23:09, 27 January 2025 (UTC)

(+1) but oppose any further efforts on structured data, which is redundant to categories, not used, and a time sink. Moreover, people draw hasty conclusions about machine-assisted tagging just because one badly designed attempt failed. I am not saying that it should be done, but it could probably be done in a way that works well and saves people's time while organizing media better, at scale and more reliably. Prototyperspective (talk) 10:19, 2 February 2025 (UTC)
(+1) --Enyavar (talk)

The short overview of the impact funding model choices suggests that Commons gets either development on the frontend (even more Wikipedia integration) or development on the backend (data warehouse capabilities). From my point of view, Commons already IS well integrated into Wikipedia and other Wikimedia projects like Wikisource and Wiktionary. I see little that is lacking on the "frontend". One thing that could be abolished (for registered users) is the MediaViewer "middleware": its only purpose is to prevent users from clicking on an image and being routed directly to Wikimedia Commons. Registered users don't need that hurdle (or they can opt in if for some reason they do).
BUT IF we have to choose which additional capabilities Commons gets, then I'd prefer that its capabilities as a multimedia data repository get developed, here on the backend.
Now, there are very different media types, and not all should be handled the same. One such type is regular photographs, of course, and I'll let others worry about those, because photos are about a myriad of things which all need different structured data entries. I'm most concerned with two special media types:

  • book scans (we have millions of fully scanned books on Commons), which have so far proved pretty immune to structured data (while libraries have been able to catalogue books for centuries). My top wish: all media that is determined to be a "book" (pdf/djvu) gets a few SD statements assigned as preselected standard fields. Here is a random example I grabbed off the digital shelf: Publishing place: [London]. Publisher: [Jarrold & Sons]. Publication date: [1895]. Title: [Cratfield Parish Papers]. Author: [John James Raven] & [William Holland]. Digitizer: [archive.org] (?).
    Currently, an editor who is potentially inclined to fill in structured data for this would have to navigate to the "structured data" tab, search manually among all the hundreds of thousands of available SD statements, manually identify each of the correct statement names, manually apply them to the file, and manually fill them in. Despite some software assistance for this, I have not noticed anybody doing it. Uploading institutions always migrate their metadata and structured data into the file description, and their uploads make up most of our quality content in the field of books. Tasked with extracting structured data from descriptions, bots could do well over 80% of the cataloguing work; and only in this segment do I see any potential for tightly controlled machine learning/assistance: we have a huge mass of work, it is highly structured work, and it is work that humans are unwilling to do. Special attention needs to be paid to the hundreds of thousands of duplicates from different institutions, and to distinguishing different editions of books with the same title and content. If this were done correctly, it would instantly open all these books to better search access on both Wikidata and Commons, and also facilitate categorization efforts in Commons.
  • map scans (we also have millions of those here on Commons) and maps in general. Like books, old maps should get a structured data template. Again, a random example: Depicted (full map extent): [Paris (city)]. Title: [Plan de Paris]. Scale(s): ~[1:1,800]. Language(s): [French]. Digitizer(s): [Norman B. Leventhal Map Center]. Type(s): [Index map] & [City map]. Part of: [Turgot map of Paris]. Orientation(s): [North] (=default). Creator(s): [Louis Bretez] & [Claude Lucas] & [Aubin]. Source: [Scan of original publication]. Publication date: [1739]. Publisher: [Michel Etienne Turgot]. Year depicted: [1730s] (=contemporary/default)
    As with books, most digitizing institutions have brought standard attributes into the file descriptions, and bots could extract most of that information and convert it into structured data. The same standard data input model also largely applies to digital user-created maps.
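
To make the bot idea above concrete, here is a minimal sketch in Python of what "extracting structured data from descriptions" could look like for the book example. Everything in it is an assumption for illustration: the template name and parameter names in `sample`, the `FIELD_MAP` labels (taken from the example fields above, not real property IDs), and the `extract_fields` helper are hypothetical, and the simple regular expression would not cope with values that contain wiki links or nested templates.

```python
import re

# Hypothetical mapping from description-template parameters to the
# preselected fields named in the book example (labels, not property IDs).
FIELD_MAP = {
    "author": "Author",
    "title": "Title",
    "publisher": "Publisher",
    "city": "Publishing place",
    "date": "Publication date",
}

# Matches "|name = value" pairs; deliberately naive (no nested markup).
PARAM_RE = re.compile(r"\|\s*([^=|{}]+?)\s*=\s*([^|{}]*)")


def extract_fields(wikitext):
    """Return {field label: value} for recognised, non-empty parameters."""
    fields = {}
    for name, value in PARAM_RE.findall(wikitext):
        field = FIELD_MAP.get(name.strip().lower())
        value = value.strip()
        if field and value:
            fields[field] = value
    return fields


# Hypothetical file description, using the values from the example above.
sample = """{{Book
 |Author    = John James Raven & William Holland
 |Title     = Cratfield Parish Papers
 |Publisher = Jarrold & Sons
 |City      = London
 |Date      = 1895
 |Source    =
}}"""

for field, value in extract_fields(sample).items():
    print(f"{field}: {value}")
```

A real bot would still need a proper wikitext parser, matching of the extracted strings to existing items, and human review of uncertain cases; the sketch only shows that the first step is mechanical when the description is already structured.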

So this is my input; I hope something comes of it. Similar standard data entry models could also be applied to "portrait images" and "images of buildings", but I would imagine that there is more variation with those. In any case, my suggestion is that efforts are placed more humbly on "closed-controlled automatization", to first bring the metadata up to a level from which "machine learning" can later take off, and that we first try to improve the value of existing data before introducing new user interfaces. --Enyavar (talk) 17:42, 29 January 2025 (UTC)


...

Input for the fourth call (February 5)

I'm unable to join the February 5 calls due to work obligations; adding my feedback here.

  • I fully agree with other participants who have expressed that the dichotomies we are presented with are highly problematic. It's the duty of organizations that serve our movement and strategy to help the Wikimedia community achieve this strategy. A mindset of scarcity ("we have to choose") is divisive, and inappropriate for that. In times of worldwide political turmoil, we need communities and organizations that think big and inspire.
  • Similarly, the expression "we would not invest in both" is also problematic in so many ways. What right does the Wikimedia Foundation have to position itself like this, introduce the concept of a "we", introduce the concept of "investing"? If there's a we, it is the Wikimedia community, its content partners, and the millions of people who consult our collectively collected and curated knowledge. If there's investment, it lies with the millions of hours that Wikimedians, cultural heritage institutions and other partners have put into digitizing, photographing, uploading and curating.
  • That said, I work with larger budgets myself; I do see and agree that our movement works on a shoestring and that the Wikimedia movement does need more financial means to do more. We should continue working as nimbly as possible, however.
    • I've been working on online knowledge platforms for 25 years now. One strong pattern I see in funding such platforms: temporary external project funding (e.g. subsidies, grants) is problematic because it provides a temporary fix, but will not be continued (that's the nature of it). I can't count the number of abandoned projects and platforms I've seen in my time, that were once created through such funding.
      • Additionally, in the current political climate, external funding is bound to become even more precarious (recent example of the NSF), and institutions we see as reliable are also bound to become less so, and more precarious.
    • Instead, we should aim for autonomy as much as possible. The Wikimedia movement's donations-based funding has been consistent and in strong alignment with our mission (a commons supported by people around the world; if you can't contribute knowledge, you can contribute a bit financially). I am all for increasing annual donations, and also explicitly asking for donations while spreading the word and educating about our other platforms, including Wikimedia Commons. Let's also use opportunities like the "Wokepedia incident" to collect and welcome extra funds.
  • Wikimedia Commons has always been a good in its own right. It should be further developed and supported as a general free media repository, a repository to autonomously browse, explore and learn through audiovisual knowledge, which Wikipedia is not offering. A well-developed autonomous Wikimedia Commons will finally provide our movement with a suitable platform to share knowledge that is not, and may never be, properly describable in and/or notable for Wikipedia.
  • There are natural partners for Wikimedia Commons that we could collaborate more strongly with: Internet Archive (the media repository part), Flickr Commons, and various other usually donation-funded non-profit platforms that collect and share free media as a commons as part of thematic work (e.g. iNaturalist). I would love to see us work with them more explicitly and formally, to share infrastructure and resources and integrate with each other where that makes sense. This will strengthen the commons in general.

Spinster (talk) 08:25, 2 February 2025 (UTC)

Seconding this, and also probably not available for the call. - Jmabel ! talk 18:01, 2 February 2025 (UTC)
@Spinster Thanks, I'll be sure to report this also during the call to further the discussion. Sannita (WMF) (talk) 15:10, 3 February 2025 (UTC)