this post was submitted on 11 Dec 2023
In that case, you can iterate over the rows instead of using apply().
Test it out and see if it's more efficient.
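For reference, a minimal sketch of what iterating over the rows could look like. The column names (a, b, total, product) and the compute() function are made up for illustration; they are not from the original code. It uses itertuples(), which is generally faster than iterrows(), but whether it beats apply() depends on the data, so time it on the real DataFrame.

```python
import pandas as pd

# Hypothetical data; the column names are placeholders, not from the thread.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})


def compute(a, b):
    # Stand-in for the real per-row calculation; returns two new values.
    return a + b, a * b


# itertuples() yields one lightweight namedtuple per row.
results = [compute(row.a, row.b) for row in df.itertuples(index=False)]

# Turn the list of (total, product) tuples into two new columns.
df["total"] = [r[0] for r in results]
df["product"] = [r[1] for r in results]

print(df)
```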
Also, you can improve performance by only passing the required columns to apply().
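Something along these lines, assuming a hypothetical frame where the function only reads columns a and b (all names here are invented for the example). Selecting the columns first means each row handed to the function is smaller, and it can avoid building mixed-type rows when the unused columns hold strings.

```python
import pandas as pd

# Hypothetical data; "notes" stands in for columns the calculation never reads.
df = pd.DataFrame({
    "a": [1, 2, 3],
    "b": [10, 20, 30],
    "notes": ["x", "y", "z"],
})


def compute(row):
    # Stand-in for the real per-row calculation.
    return row["a"] + row["b"]


# Select only the columns the function needs before calling apply().
df["total"] = df[["a", "b"]].apply(compute, axis=1)

print(df)
```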
Actually this seems like a better solution for you.
Here's another approach. I like this one more because it is a closer match to the problem you described.
Check the result_type="expand" argument for df.apply().
Thanks, I tried the first approach, but it was slower than what I was doing. The second one didn't work because I use some of the newly generated columns to create new ones, but running the process twice, so that the second pass can use the new columns to create the additional ones, worked well and reduced the processing time from 22m to 13m. There may be ways to optimize the code even more, but 13 minutes is good enough for me.
Edit: for some reason it broke the information in some way, and the next steps of the process are giving me errors.
Edit2: I'm an idiot, I made an error while updating the code to the new method.
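For anyone finding this later, a rough sketch of the two-pass idea with result_type="expand". The data, column names, and both functions are hypothetical, not the code from this thread. The first apply() creates the new columns, and the second one reads them to build the additional columns.

```python
import pandas as pd

# Hypothetical data; the column names are placeholders.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})


def first_pass(row):
    # Returns two values per row; result_type="expand" turns them into columns.
    return [row["a"] + row["b"], row["a"] * row["b"]]


def second_pass(row):
    # Depends on the columns created by the first pass.
    return [row["product"] - row["total"], row["product"] + row["total"]]


# Pass 1: create the first batch of new columns from the original ones.
df[["total", "product"]] = df[["a", "b"]].apply(
    first_pass, axis=1, result_type="expand"
)

# Pass 2: create the columns that need the results of pass 1.
df[["diff", "grand"]] = df[["total", "product"]].apply(
    second_pass, axis=1, result_type="expand"
)

print(df)
```

Keeping the two passes separate makes the dependency explicit: nothing in the second function runs until the columns it reads exist.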