The Blog of Someone Who Builds Things on the Internet

I google how to do things, then post the answers here cause I'll probably forget. Maybe someone else finds it useful.

Creating a ZIP file including a CSV summary of contents with Django

Published March 16, 2017

So I was building an "apply to work here" application for the company I work for, and I was using Django as my framework. My company has a bunch of entry level/part time jobs we post, so people would usually want to apply for more than one at a time, meaning on the apply page there is a list of jobs, and you check off one or more positions you want to be considered for. You then have to upload one resume and one or more cover letters depending on the job you applied for.

From the Admin side, I wanted managers to be able to log in and download a zip file that had all the resumes and cover letters for applicants in their area, and I also wanted to include a CSV file in that zip download with an overview of everyone and the various fields they fill out when submitting their application. I built this as an action for the list view in the admin, but the core functionality (make a CSV from a DB and add it to a zip file) can be used in any situation.

The first part of this is the Admin action. I'm not really going to get into how admin actions work; you can read the Django reference here:

# admin.py
from django.contrib import admin
from .models import Applicant

class ApplicantAdmin(admin.ModelAdmin):
    actions = ['download_applications']

    def download_applications(self, request, queryset):
        # THE MAKE A CSV AND ZIP CODE WILL GO HERE
        pass  # placeholder so the file is valid Python until the body is filled in

    download_applications.short_description = "Download the selected applications"

admin.site.register(Applicant, ApplicantAdmin)

Now you have the basics for the admin file.

Getting started with the actual function, first you need to import a few libraries:

from django.conf import settings
from django.http import HttpResponse
import zipfile, os, datetime, io, csv

Most of those are self-explanatory. You don't need datetime, but I use it to give the final file a name indicating when it was downloaded. A library that was new for me was io. I'm still not super clear on it, as I'm not really familiar with the term "stream" in the context that seems to get used a lot around it, but it seems to boil down to creating file-like objects in memory and adding to them over the lifetime of the current program run. If I'm totally wrong about that, my bad, internet. The reference docs for io are here.
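To make the "files in memory" idea a bit more concrete, here is a minimal sketch of the two stream types this post uses. The variable names and the sample contents are made up for illustration; only io.StringIO, io.BytesIO, write() and getvalue() come from the standard library.

```python
import io

# A text stream held in memory: behaves like a file opened in text mode.
text_buffer = io.StringIO()
text_buffer.write("first line\n")
text_buffer.write("second line\n")

# A binary stream held in memory: behaves like a file opened in binary mode.
byte_buffer = io.BytesIO()
byte_buffer.write(b"PK")  # arbitrary sample bytes

# getvalue() hands back everything written so far.
text_contents = text_buffer.getvalue()  # a str
byte_contents = byte_buffer.getvalue()  # a bytes object

print(repr(text_contents))
print(repr(byte_contents))
```

The CSV is text, so it goes into a StringIO. The zip file is binary, so it goes into a BytesIO.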

After the libraries are imported, the rest of this code goes in the admin.py file as indicated by the comment above. First is to create the CSV file:

# Create the empty csv
csv_file = io.StringIO()
writer = csv.writer(csv_file)

# Get the field names
opts = self.model._meta
field_names = [field.name for field in opts.fields]

# Write the header row for the csv file
writer.writerow(field_names)

# Loop each row in the db and output to the file
for obj in queryset:
    # Write the information to the csv file
    writer.writerow([str(getattr(obj, field)) for field in field_names])

csv_file.seek(0)

Pretty basic "make a CSV file" code. The only thing new for me was the csv_file.seek(0) at the bottom there. Basically, you want to get back to the beginning of the data in the "stream" so you can dump it out later.
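Here is a small sketch of what the position in the stream actually does. The two rows are made-up sample data, not real applicant fields.

```python
import csv
import io

csv_file = io.StringIO()
writer = csv.writer(csv_file)
writer.writerow(["id", "fname"])
writer.writerow([1, "Ada"])

# After writing, the position sits at the end, so read() finds nothing left.
read_at_end = csv_file.read()

# Rewind to the start, and read() returns everything that was written.
csv_file.seek(0)
read_after_seek = csv_file.read()

# getvalue() returns the full contents wherever the position happens to be.
full_contents = csv_file.getvalue()

print(repr(read_at_end))
print(repr(read_after_seek))
```

Since getvalue() ignores the position, the seek(0) only matters if you later call read() on the buffer or hand it to something that does. It is harmless to leave in either way.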

Next up, the zip file. The first thing you need is a list of your files. In my models, since applicants can have 2+ files, I have another model, File, which has a many-to-one relationship with Applicant (one applicant, many files). I expanded the for loop in the above code to also get the files for each applicant:

# Loop each row in the db and output to the file
files = []
for obj in queryset:
    # Write the information to the csv file
    writer.writerow([str(getattr(obj, field)) for field in field_names])
    # Get the list of files attached to this application and save the name for later
    for file in obj.file_set.all():
        # os.path.join works whether or not MEDIA_ROOT ends with a slash
        files.append(os.path.join(settings.MEDIA_ROOT, file.file.name))
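One thing to watch when building the absolute path from settings.MEDIA_ROOT is whether the setting ends with a slash. Below is a minimal sketch comparing plain concatenation with a path join. The MEDIA_ROOT value and the stored file name are hypothetical, and I use posixpath (which is what os.path resolves to on Linux and macOS) so the output is the same everywhere.

```python
import posixpath  # os.path is this module on Linux and macOS

MEDIA_ROOT = "/srv/media"  # hypothetical setting, note: no trailing slash
stored_name = "applications/2017/03/resume.pdf"  # hypothetical FileField name

# Plain concatenation glues the two strings together with no separator.
concatenated = MEDIA_ROOT + stored_name

# join() inserts the separator only when it is missing.
joined = posixpath.join(MEDIA_ROOT, stored_name)
joined_with_slash = posixpath.join(MEDIA_ROOT + "/", stored_name)

print(concatenated)
print(joined)
```

If your files live on the default file system storage, the path attribute on the field file (file.file.path in my models) should give you the same absolute path without any joining at all.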

Now that you have a list of the files, you can start on the zip file:

# Start a zip file up
zip_subdir = "applications_" + datetime.datetime.now().strftime('%Y%m%d%H%M%S') # base name of the zip file to be downloaded
zip_filename = "%s.zip" % zip_subdir # just adding .zip to the filename
s = io.BytesIO() # Open a BytesIO buffer to hold the in-memory ZIP contents
zf = zipfile.ZipFile(s, 'w') # The zip writer (files are stored uncompressed by default)

# Add the files to the zip file
for fpath in files:
    fdir, fname = os.path.split(fpath) # split the file name from the path the file is stored on your server
    zf.write(fpath, fname) # Add file to the zip file

# Add the csv to the zip file
zf.writestr("_applicants.csv", csv_file.getvalue())

# Must close zip for all contents to be written
zf.close()

Alright, a lot going on here. I think I've done a good job of commenting the code, so you should be able to follow along. The one thing that threw me for a loop was zf.write(fpath, fname). The second parameter is the name the file gets inside the zip file. What had me cursing was that a lot of tutorials out there have fname = os.path.join(zip_subdir, fname), which would be fine if we weren't also adding the CSV file with zf.writestr(). What was happening to me was that I ended up with a zip file that had the CSV in it, plus a folder with the same name as the zip file and all the PDFs inside that folder. In summary, if you have fname = 'x/file.pdf', the zip file will contain a folder called x with file.pdf inside it. If every file in the archive is added with the same x in front, everything just ends up inside a single folder called x. I actually ended up using this to my advantage: in production, my app downloads a zip file with a CSV and a folder for each applicant holding their resumes and cover letters, so it was a good learning experience.
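Here is a rough sketch of that folder behaviour, built entirely in memory so it runs without any files on disk. The folder names (ada_1, grace_2) and the file contents are placeholders I made up; it is not the exact code from my production app.

```python
import io
import zipfile

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    # No slash in the name: the entry sits at the top level of the archive.
    zf.writestr("_applicants.csv", "id,fname\r\n1,Ada\r\n2,Grace\r\n")
    # A slash in the name: the part before it becomes a folder in the archive.
    zf.writestr("ada_1/resume.pdf", b"placeholder bytes")
    zf.writestr("ada_1/cover_letter.pdf", b"placeholder bytes")
    zf.writestr("grace_2/resume.pdf", b"placeholder bytes")

# Reopen the finished archive and list what ended up inside it.
with zipfile.ZipFile(io.BytesIO(buffer.getvalue())) as archive:
    entry_names = archive.namelist()

print(entry_names)
```

A per-applicant folder also sidesteps name clashes. If two applicants both upload resume.pdf and everything is written to the top level, the archive ends up with two entries of the same name.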

The final thing to do in your download_applications() function is to return the response to the user (HttpResponse comes from django.http):

resp = HttpResponse(s.getvalue(), content_type = "application/x-zip-compressed") # Grab ZIP file from in-memory, make response with correct MIME-type
resp['Content-Disposition'] = 'attachment; filename=%s' % zip_filename # ..and correct content-disposition
return resp
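If you want to sanity check the bytes before they go into the response, one option is to read them back the way an unzip tool would. This is a minimal sketch using only the standard library. The archive contents are made-up sample data, and payload stands in for the s.getvalue() that gets passed to HttpResponse.

```python
import csv
import io
import zipfile

# Build a small archive in memory, the same way the action does.
s = io.BytesIO()
with zipfile.ZipFile(s, "w") as zf:
    zf.writestr("_applicants.csv", "id,fname\r\n1,Ada\r\n")
    zf.writestr("resume.pdf", b"placeholder bytes")

payload = s.getvalue()  # the bytes that would become the response body

# Read the payload back as if it had just been downloaded.
with zipfile.ZipFile(io.BytesIO(payload)) as archive:
    first_bad_entry = archive.testzip()  # None means every CRC check passed
    csv_text = archive.read("_applicants.csv").decode("utf-8")

# Parse the CSV text to confirm the summary survived the round trip.
rows = list(csv.reader(io.StringIO(csv_text, newline="")))

print(first_bad_entry)
print(rows)
```

Something like this could live in a unit test for the action, so you find out about a broken archive before a manager does.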

Alright, that is how to create a zip file in Django with a CSV summary in it!

Full Code below (including my models.py for reference):

# admin.py
from django.contrib import admin
from django.http import HttpResponse
from django.conf import settings
import zipfile, os, datetime, io, csv

from .models import Applicant

class ApplicantAdmin(admin.ModelAdmin):
    actions = ['download_applications']

    def download_applications(self, request, queryset):
        # Create the empty csv
        csv_file = io.StringIO()
        writer = csv.writer(csv_file)

        # Get the field names
        opts = self.model._meta
        field_names = [field.name for field in opts.fields]

        # Write the header row for the csv file
        writer.writerow(field_names)

        # List to save the file names to
        files = []

        # Loop each row in the db and output to the file
        for obj in queryset:
            # Write the information to the csv file
            writer.writerow([str(getattr(obj, field)) for field in field_names])
            # Get the list of files attached to this application and save the name for later
            for file in obj.file_set.all():
                # os.path.join works whether or not MEDIA_ROOT ends with a slash
                files.append(os.path.join(settings.MEDIA_ROOT, file.file.name))

        csv_file.seek(0)

        # Start a zip file up
        zip_subdir = "applications_" + datetime.datetime.now().strftime('%Y%m%d%H%M%S') # base name of the zip file to be downloaded
        zip_filename = "%s.zip" % zip_subdir # just adding .zip to the filename
        s = io.BytesIO() # Open a BytesIO buffer to hold the in-memory ZIP contents
        zf = zipfile.ZipFile(s, 'w') # The zip writer (files are stored uncompressed by default)

        # Add the files to the zip file
        for fpath in files:
            fdir, fname = os.path.split(fpath) # split the file name from the path the file is stored on your server
            zf.write(fpath, fname) # Add file to the zip file

        # Add the csv to the zip file
        zf.writestr("_applicants.csv", csv_file.getvalue())

        # Must close zip for all contents to be written
        zf.close()

        resp = HttpResponse(s.getvalue(), content_type = "application/x-zip-compressed") # Grab ZIP file from in-memory, make response with correct MIME-type
        resp['Content-Disposition'] = 'attachment; filename=%s' % zip_filename # ..and correct content-disposition
        return resp

    download_applications.short_description = "Download the selected applications"

admin.site.register(Applicant, ApplicantAdmin)

# models.py
from django.db import models

class Applicant(models.Model):
    fname = models.CharField(max_length=100)  # CharField needs a max_length; 100 is just a placeholder
    # a bunch more attributes

class File(models.Model):
    file = models.FileField(upload_to='applications/%Y/%m/')
    applicant = models.ForeignKey(Applicant, on_delete=models.CASCADE)