How to change GAN example to generate images with a larger size?
How can I change the original GAN example (https://www.mathworks.com/help/deeplearning/ug/train-generative-adversarial-network.html) to generate images with a bigger size, e.g., 128*128?
The example works with 64*64 color images and thus produces low-resolution images. I guess this size was chosen to shorten the training time.
The images are resized and augmented before being fed to the network:
augimds = augmentedImageDatastore([64 64],imds,'DataAugmentation',augmenter);
The generator contains 4 transposed convolutional layers and the discriminator contains 5 convolutional layers. Both the generator and the discriminator use a base filter count of 64.
I edited the code by changing the image size of augimds, adding an additional transposed convolutional layer (with modified parameters) to the generator, and modifying the discriminator code accordingly. The code is shown below:
% The generator
filterSize = 5;
numFilters = 128;
numLatentInputs = 100;
projectionSize = [4 4 512];
layersGenerator = [
imageInputLayer([1 1 numLatentInputs],'Normalization','none','Name','in')
projectAndReshapeLayer(projectionSize,numLatentInputs,'proj');
transposedConv2dLayer(filterSize,8*numFilters,'Name','tconv1')
batchNormalizationLayer('Name','bnorm1')
reluLayer('Name','relu1')
transposedConv2dLayer(filterSize,4*numFilters,'Stride',1,'Cropping','same','Name','tconv2')
batchNormalizationLayer('Name','bnorm2')
reluLayer('Name','relu2')
transposedConv2dLayer(filterSize,2*numFilters,'Stride',2,'Cropping','same','Name','tconv3')
batchNormalizationLayer('Name','bnorm3')
reluLayer('Name','relu3')
transposedConv2dLayer(filterSize,numFilters,'Stride',2,'Cropping','same','Name','tconv4')
batchNormalizationLayer('Name','bnorm4')
reluLayer('Name','relu4')
transposedConv2dLayer(filterSize,3,'Stride',2,'Cropping','same','Name','tconv5')
tanhLayer('Name','tanh')];
% The discriminator
dropoutProb = 0.5;
numFilters = 128;
scale = 0.2;
inputSize = [128 128 3];
filterSize = 5;
layersDiscriminator = [
imageInputLayer(inputSize,'Normalization','none','Name','in')
dropoutLayer(0.5,'Name','dropout')
convolution2dLayer(filterSize,numFilters,'Stride',2,'Padding','same','Name','conv1')
leakyReluLayer(scale,'Name','lrelu1')
convolution2dLayer(filterSize,2*numFilters,'Stride',2,'Padding','same','Name','conv2')
batchNormalizationLayer('Name','bn2')
leakyReluLayer(scale,'Name','lrelu2')
convolution2dLayer(filterSize,4*numFilters,'Stride',2,'Padding','same','Name','conv3')
batchNormalizationLayer('Name','bn3')
leakyReluLayer(scale,'Name','lrelu3')
convolution2dLayer(filterSize,8*numFilters,'Stride',2,'Padding','same','Name','conv4')
batchNormalizationLayer('Name','bn4')
leakyReluLayer(scale,'Name','lrelu4')
convolution2dLayer(4,1,'Name','conv5')];
lgraphDiscriminator = layerGraph(layersDiscriminator);
But the modifications produced an error in this line of code:
[gradientsGenerator, gradientsDiscriminator, stateGenerator, scoreGenerator, scoreDiscriminator] = ...
dlfeval(@modelGradients, dlnetGenerator, dlnetDiscriminator, dlX, dlZ, flipFactor);
Specifically, in the dlfeval call:
Error using dlfeval (line 43)
Value to differentiate must be a traced dlarray scalar.
I am trying to figure out the relationship between the image size and the other parameters (number of filters, number of layers, ...) so that I can modify them to generate images with sizes other than 64*64.
Thanks
Comments: 1
Anthony Herdman
23 Sep 2020
Tarunbir indicated that you need to change the stride in 'tconv2' to 2, but I believe you also need to change a convolution layer in the discriminator.
The following solved it for me (see the 'conv3' line below). In the discriminator, I changed the 'Stride' from 2 to 4 on one of the convolution layers (in this case 'conv3') to get activations of 1x1x1 in conv5. If you don't do this, the activations in conv5 will be 5x5x1, which leads to 5x5x1xM predictions when calling the function modelGradients (see the line "dlYPred = forward(dlnetDiscriminator, dlX);").
Hope this works for you.
layersDiscriminator = [
imageInputLayer(inputSize,'Normalization','none','Name','in')
dropoutLayer(0.5,'Name','dropout')
convolution2dLayer(filterSize,numFilters,'Stride',2,'Padding','same','Name','conv1')
leakyReluLayer(scale,'Name','lrelu1')
convolution2dLayer(filterSize,2*numFilters,'Stride',2,'Padding','same','Name','conv2')
batchNormalizationLayer('Name','bn2')
leakyReluLayer(scale,'Name','lrelu2')
convolution2dLayer(filterSize,4*numFilters,'Stride',4,'Padding','same','Name','conv3')
batchNormalizationLayer('Name','bn3')
leakyReluLayer(scale,'Name','lrelu3')
convolution2dLayer(filterSize,8*numFilters,'Stride',2,'Padding','same','Name','conv4')
batchNormalizationLayer('Name','bn4')
leakyReluLayer(scale,'Name','lrelu4')
convolution2dLayer(4,1,'Name','conv5')];
Answers (5)
Tarunbir Gambhir
3 Sep 2020
A generative adversarial network consists of a generator network and a discriminator network that train together to generate data with the characteristics of the real data. If the size of the real data changes, both networks need to be altered to accommodate the change.
The output of the generator network needs to be the same size as the real images. For your case, with images of size 128 x 128 x 3, the layer "tconv2" of the generator network should have the following specification:
transposedConv2dLayer(filterSize,4*numFilters,'Stride',2,'Cropping','same','Name','tconv2')
Explanation: After the "proj" layer, the data has size 4 x 4 x 512. After the "tconv1" layer, it has size 8 x 8 x 1024 (filter size 5, default stride 1, no cropping: (4 - 1)*1 + 5 = 8). You can look up how to calculate the output size of a transposed convolution layer. For the remaining 4 layers, the output size needs to double at every layer in order to reach the output size of 128 x 128 x 3. Setting "Cropping" to "same" ensures that the output size equals inputSize .* Stride; see the Transposed Convolution Layer documentation.
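As a rough check of the generator sizes described above, the arithmetic can be sketched in plain MATLAB without the toolbox. The helpers tconvValid and tconvSame are hypothetical names (not toolbox functions); they assume the usual transposed-convolution size rules, (in - 1)*stride + filterSize with no cropping and in*stride with 'Cropping','same'.

```matlab
% Hypothetical helpers (not toolbox functions) for the spatial size of a
% transposed convolution output along one dimension.
tconvValid = @(in, filterSize, stride) (in - 1)*stride + filterSize; % no cropping
tconvSame  = @(in, stride) in*stride;                                % 'Cropping','same'

filterSize = 5;
sz = 4;                                  % spatial size after 'proj' (4 x 4 x 512)
sz = tconvValid(sz, filterSize, 1);      % tconv1: default stride 1, no cropping
sizes = sz;                              % collect the size after each tconv layer
for stride = [2 2 2 2]                   % tconv2..tconv5, all with 'Stride',2
    sz = tconvSame(sz, stride);
    sizes(end+1) = sz;
end
disp(sizes)                              % sizes after tconv1..tconv5

% With 'Stride',1 on tconv2 (as in the question), only three layers double
% the size, so the generator output stays at 64 x 64.
szQuestion = tconvSame(tconvSame(tconvSame(tconvSame(8, 1), 2), 2), 2);
disp(szQuestion)
```

The first display should end at 128 and the second at 64, which is why a generator with stride 1 in "tconv2" cannot match a 128 x 128 discriminator input.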
After this, you need to ensure that the discriminator network outputs a single value for every input image of size 128 x 128 x 3. This is because the discriminator model outputs one probability per data point after the sigmoid function. For your case, the layer "conv5" should have the following specification:
convolution2dLayer(8,1,'Name','conv5')
Explanation: You can go through the MATLAB documentation on 2D Convolution layers to understand how the kernel size affects the output size of that layer.
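To make the kernel-size point concrete, here is a minimal sketch of the discriminator's spatial sizes in plain MATLAB. It assumes the standard rules: a convolution with 'Padding','same' gives ceil(in/stride), and an unpadded stride-1 convolution gives in - filterSize + 1. The helper names convSame and convValid are illustrative only, not toolbox functions.

```matlab
% Spatial size along one dimension; these anonymous functions are
% illustrative helpers, not toolbox functions.
convSame  = @(in, stride) ceil(in/stride);            % 'Padding','same'
convValid = @(in, filterSize) in - filterSize + 1;    % no padding, stride 1

sz = 128;                        % input image is 128 x 128 x 3
for k = 1:4                      % conv1..conv4, each with 'Stride',2
    sz = convSame(sz, 2);
end
disp(sz)                         % spatial size entering conv5

outWith4 = convValid(sz, 4);     % conv5 with a 4 x 4 kernel
outWith8 = convValid(sz, 8);     % conv5 with an 8 x 8 kernel
fprintf('kernel 4: %d, kernel 8: %d\n', outWith4, outWith8);
```

Under these assumptions the 4 x 4 kernel leaves a 5 x 5 map, while the 8 x 8 kernel reduces the 8 x 8 input to a single value per image.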
For debugging, I suggest you run the following script to check that the generator and discriminator networks give outputs of the correct size.
[outputY,~] = forward(dlnetMODEL,inputX);
disp(size(outputY));
Note: The required output size can also be obtained by modifying any of the "conv"/"tconv" layers, by adding more "conv" layers, or by adding a global average pooling layer at the end of the network, as long as the layer parameters are chosen so that the sizes work out.
Comments: 2
Whussa
17 Sep 2020
Has someone solved this yet? I tried to implement Tarunbir's answer but still get the same error.
This is my code (general GAN example):
datasetFolder = fullfile('/Users/bilder gan');
imds = imageDatastore(datasetFolder, ...
'IncludeSubfolders',true);
augmenter = imageDataAugmenter('RandXReflection',false);
augimds = augmentedImageDatastore([128 128],imds,'DataAugmentation',augmenter);
%%
filterSize = 5;
numFilters = 128;
numLatentInputs = 100;
projectionSize = [4 4 512];
layersGenerator = [
imageInputLayer([1 1 numLatentInputs],'Normalization','none','Name','in')
projectAndReshapeLayer(projectionSize,numLatentInputs,'proj');
transposedConv2dLayer(filterSize,8*numFilters,'Name','tconv1')
batchNormalizationLayer('Name','bnorm1')
reluLayer('Name','relu1')
transposedConv2dLayer(filterSize,4*numFilters,'Stride',1,'Cropping','same','Name','tconv2')
batchNormalizationLayer('Name','bnorm2')
reluLayer('Name','relu2')
transposedConv2dLayer(filterSize,2*numFilters,'Stride',2,'Cropping','same','Name','tconv3')
batchNormalizationLayer('Name','bnorm3')
reluLayer('Name','relu3')
transposedConv2dLayer(filterSize,numFilters,'Stride',2,'Cropping','same','Name','tconv4')
batchNormalizationLayer('Name','bnorm4')
reluLayer('Name','relu4')
transposedConv2dLayer(filterSize,3,'Stride',2,'Cropping','same','Name','tconv5')
tanhLayer('Name','tanh')];
lgraphGenerator = layerGraph(layersGenerator);
%%
dlnetGenerator = dlnetwork(lgraphGenerator);
%%
dropoutProb = 0.5;
numFilters = 128;
scale = 0.2;
inputSize = [128 128 3];
filterSize = 5;
layersDiscriminator = [
imageInputLayer(inputSize,'Normalization','none','Name','in')
dropoutLayer(0.5,'Name','dropout')
convolution2dLayer(filterSize,numFilters,'Stride',2,'Padding','same','Name','conv1')
leakyReluLayer(scale,'Name','lrelu1')
convolution2dLayer(filterSize,2*numFilters,'Stride',2,'Padding','same','Name','conv2')
batchNormalizationLayer('Name','bn2')
leakyReluLayer(scale,'Name','lrelu2')
convolution2dLayer(filterSize,4*numFilters,'Stride',2,'Padding','same','Name','conv3')
batchNormalizationLayer('Name','bn3')
leakyReluLayer(scale,'Name','lrelu3')
convolution2dLayer(filterSize,8*numFilters,'Stride',2,'Padding','same','Name','conv4')
batchNormalizationLayer('Name','bn4')
leakyReluLayer(scale,'Name','lrelu4')
convolution2dLayer(8,1,'Name','conv5')];
lgraphDiscriminator = layerGraph(layersDiscriminator);
%%
dlnetDiscriminator = dlnetwork(lgraphDiscriminator);
%%
numEpochs = 500;
miniBatchSize = 128;
augimds.MiniBatchSize = miniBatchSize;
%%
learnRate = 0.0002;
gradientDecayFactor = 0.5;
squaredGradientDecayFactor = 0.999;
%%
executionEnvironment = "auto";
%%
flipFactor = 0.3;
%%
validationFrequency = 100;
%% TRAINING
trailingAvgGenerator = [];
trailingAvgSqGenerator = [];
trailingAvgDiscriminator = [];
trailingAvgSqDiscriminator = [];
%%
numValidationImages = 25;
ZValidation = randn(1,1,numLatentInputs,numValidationImages,'single');
%%
dlZValidation = dlarray(ZValidation,'SSCB');
%%
f = figure;
f.Position(3) = 2*f.Position(3);
%%
imageAxes = subplot(1,2,1);
scoreAxes = subplot(1,2,2);
%%
lineScoreGenerator = animatedline(scoreAxes,'Color',[0 0.447 0.741]);
lineScoreDiscriminator = animatedline(scoreAxes, 'Color', [0.85 0.325 0.098]);
legend('Generator','Discriminator');
ylim([0 1])
xlabel("Iteration")
ylabel("Score")
grid on
%%
iteration = 0;
start = tic;
% Loop over epochs.
for epoch = 1:numEpochs
    % Reset and shuffle datastore.
    reset(augimds);
    augimds = shuffle(augimds);
    % Loop over mini-batches.
    while hasdata(augimds)
        iteration = iteration + 1;
        % Read mini-batch of data.
        data = read(augimds);
        % Ignore last partial mini-batch of epoch.
        if size(data,1) < miniBatchSize
            continue
        end
        % Concatenate mini-batch of data and generate latent inputs for the
        % generator network.
        X = cat(4,data{:,1}{:});
        X = single(X);
        Z = randn(1,1,numLatentInputs,size(X,4),'single');
        % Rescale the images in the range [-1 1].
        X = rescale(X,-1,1,'InputMin',0,'InputMax',255);
        % Convert mini-batch of data to dlarray and specify the dimension labels
        % 'SSCB' (spatial, spatial, channel, batch).
        dlX = dlarray(X, 'SSCB');
        dlZ = dlarray(Z, 'SSCB');
        % If training on a GPU, then convert data to gpuArray.
        if (executionEnvironment == "auto" && canUseGPU) || executionEnvironment == "gpu"
            dlX = gpuArray(dlX);
            dlZ = gpuArray(dlZ);
        end
        % Evaluate the model gradients and the generator state using
        % dlfeval and the modelGradients function listed at the end of the
        % example.
        [gradientsGenerator, gradientsDiscriminator, stateGenerator, scoreGenerator, scoreDiscriminator] = ...
            dlfeval(@modelGradients, dlnetGenerator, dlnetDiscriminator, dlX, dlZ, flipFactor);
        dlnetGenerator.State = stateGenerator;
        % Update the discriminator network parameters.
        [dlnetDiscriminator,trailingAvgDiscriminator,trailingAvgSqDiscriminator] = ...
            adamupdate(dlnetDiscriminator, gradientsDiscriminator, ...
            trailingAvgDiscriminator, trailingAvgSqDiscriminator, iteration, ...
            learnRate, gradientDecayFactor, squaredGradientDecayFactor);
        % Update the generator network parameters.
        [dlnetGenerator,trailingAvgGenerator,trailingAvgSqGenerator] = ...
            adamupdate(dlnetGenerator, gradientsGenerator, ...
            trailingAvgGenerator, trailingAvgSqGenerator, iteration, ...
            learnRate, gradientDecayFactor, squaredGradientDecayFactor);
        % Every validationFrequency iterations, display batch of generated images using the
        % held-out generator input
        if mod(iteration,validationFrequency) == 0 || iteration == 1
            % Generate images using the held-out generator input.
            dlXGeneratedValidation = predict(dlnetGenerator,dlZValidation);
            % Tile and rescale the images in the range [0 1].
            I = imtile(extractdata(dlXGeneratedValidation));
            I = rescale(I);
            % Display the images.
            subplot(1,2,1);
            image(imageAxes,I)
            xticklabels([]);
            yticklabels([]);
            title("Generated Images");
        end
        % Update the scores plot
        subplot(1,2,2)
        addpoints(lineScoreGenerator,iteration,...
            double(gather(extractdata(scoreGenerator))));
        addpoints(lineScoreDiscriminator,iteration,...
            double(gather(extractdata(scoreDiscriminator))));
        % Update the title with training progress information.
        D = duration(0,0,toc(start),'Format','hh:mm:ss');
        title(...
            "Epoch: " + epoch + ", " + ...
            "Iteration: " + iteration + ", " + ...
            "Elapsed: " + string(D))
        drawnow
    end
end
%% Save the trained generator (defined before it is used below)
thirdTryNet = dlnetGenerator;
save thirdTryNet
%% Generate new images
ZNew = randn(1,1,numLatentInputs,25,'single');
dlZNew = dlarray(ZNew,'SSCB');
%%
dlXGeneratedNew = predict(thirdTryNet,dlZNew);
%%
I = imtile(extractdata(dlXGeneratedNew));
I = rescale(I);
figure
image(I)
axis off
title("Generated Images")
Stavros
6 Jul 2022
I managed to produce 128x128 images, but what about bigger images such as 512x512?
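Following the same size arithmetic used earlier in this thread for 128 x 128 (8 x 8 after "tconv1", then doubling with each stride-2 layer), a rough layer count for other power-of-two sizes can be sketched as below. This is only size bookkeeping under the assumption that the same layer pattern is kept; it says nothing about whether training will be stable or fit in memory.

```matlab
targetSize = 512;                 % desired output size (assumed a power of two)
baseSize   = 8;                   % spatial size after 'tconv1' in the code above

% Each stride-2 'same' transposed convolution doubles the size, so the
% generator needs log2(targetSize/baseSize) of them after 'tconv1'.
numDoubling = log2(targetSize/baseSize);
assert(numDoubling == round(numDoubling), 'targetSize/baseSize must be a power of two');

% Mirror image for the discriminator: the same number of stride-2 'same'
% convolutions brings targetSize down to baseSize, and a final unpadded
% convolution with a baseSize x baseSize kernel then gives a 1 x 1 output.
sz = targetSize;
for k = 1:numDoubling
    sz = ceil(sz/2);
end
finalOut = sz - baseSize + 1;

fprintf('stride-2 layers: %d, size before last conv: %d, final output: %d\n', ...
    numDoubling, sz, finalOut);
```

Changing targetSize to 128 gives 4 stride-2 layers, which matches the 128 x 128 networks posted in this thread.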
Comments: 2
Ziqi Sun
17 Oct 2022
Hey, can you share the code? I followed the code above but got a negative training variance error...
Cecilia Di Ruberto
2 Nov 2022
Hi, I got the same error: "Expected TrainedVariance to be positive". I followed all the suggestions.
If you managed to produce the bigger images, could you please share the code? Thanks in advance.
Suhail Mahmud
8 Nov 2022
I was able to generate 128-by-128-pixel images using the following code:
augimds = augmentedImageDatastore([128 128],imds,DataAugmentation=augmenter,ColorPreprocessing="gray2rgb");
%% This is the Generator
filterSize = 5;
numFilters = 128;
numLatentInputs = 100;
projectionSize = [4 4 512];
layersGenerator = [
featureInputLayer(numLatentInputs)
projectAndReshapeLayer(projectionSize)
transposedConv2dLayer(filterSize,8*numFilters)
batchNormalizationLayer
reluLayer
transposedConv2dLayer(filterSize,4*numFilters,Stride=2,Cropping="same")
batchNormalizationLayer
reluLayer
transposedConv2dLayer(filterSize,2*numFilters,Stride=2,Cropping="same")
batchNormalizationLayer
reluLayer
transposedConv2dLayer(filterSize,numFilters,Stride=2,Cropping="same")
batchNormalizationLayer
reluLayer
transposedConv2dLayer(filterSize,3,Stride=2,Cropping="same")
tanhLayer];
netG = dlnetwork(layersGenerator);
%% This is the Discriminator
dropoutProb = 0.75;
numFilters = 64;
scale = 0.2;
inputSize = [128 128 3];
filterSize = 5;
layersDiscriminator = [
imageInputLayer(inputSize,Normalization="none")
dropoutLayer(dropoutProb)
convolution2dLayer(filterSize,numFilters,Stride=2,Padding="same")
leakyReluLayer(scale)
convolution2dLayer(filterSize,2*numFilters,Stride=2,Padding="same")
batchNormalizationLayer
leakyReluLayer(scale)
convolution2dLayer(filterSize,4*numFilters,Stride=2,Padding="same")
batchNormalizationLayer
leakyReluLayer(scale)
convolution2dLayer(filterSize,8*numFilters,Stride=2,Padding="same")
batchNormalizationLayer
leakyReluLayer(scale)
convolution2dLayer(8,1)
sigmoidLayer];
netD = dlnetwork(layersDiscriminator);
The rest of the code is the same as in the GAN example. Just make sure you have enough computational resources to run it. Best of luck.
Comments: 1
Fred Liu
11 Nov 2022
You can try the following code; I hope it helps.
Thanks also for the code contributed earlier. Unfortunately I only saw it afterwards, and I will try implementing it as well.
Generator
filterSize = 5;
numFilters = 128;
numLatentInputs = 100;
projectionSize = [4 4 512];
layersGenerator = [
featureInputLayer(numLatentInputs)
projectAndReshapeLayer(projectionSize)
transposedConv2dLayer(filterSize,8*numFilters)
batchNormalizationLayer
reluLayer
transposedConv2dLayer(filterSize,4*numFilters,Stride=2,Cropping="same")
batchNormalizationLayer
reluLayer
transposedConv2dLayer(filterSize,2*numFilters,Stride=2,Cropping="same")
batchNormalizationLayer
reluLayer
transposedConv2dLayer(filterSize,numFilters,Stride=2,Cropping="same")
batchNormalizationLayer
reluLayer
transposedConv2dLayer(filterSize,3,Stride=2,Cropping="same")
tanhLayer];
Discriminator
dropoutProb = 0.5;
numFilters = 128;
scale = 0.2;
inputSize = [128 128 3];
filterSize = 5;
layersDiscriminator = [
imageInputLayer(inputSize,Normalization="none")
dropoutLayer(dropoutProb)
convolution2dLayer(filterSize,numFilters,Stride=2,Padding="same")
leakyReluLayer(scale)
dropoutLayer(dropoutProb)
convolution2dLayer(filterSize,2*numFilters,Stride=2,Padding="same")
batchNormalizationLayer
leakyReluLayer(scale)
dropoutLayer(dropoutProb)
convolution2dLayer(filterSize,4*numFilters,Stride=2,Padding="same")
batchNormalizationLayer
leakyReluLayer(scale)
dropoutLayer(dropoutProb)
convolution2dLayer(filterSize,8*numFilters,Stride=2,Padding="same")
batchNormalizationLayer
leakyReluLayer(scale)
convolution2dLayer(8,1)
sigmoidLayer];
Training Options
learnRate = 0.0001;
gradientDecayFactor = 0.5;
squaredGradientDecayFactor = 0.999;
Comments: 1
Jonathan
11 Nov 2022
We display a few low-resolution images to see what we are training our models on.
# Display 10 real images
# (data_lowres is assumed to be defined earlier as an indexable collection of image arrays)
import matplotlib.pyplot as plt

fig, axs = plt.subplots(2, 5, sharey=False, tight_layout=True, figsize=(16,9), facecolor='white')
n = 0
for i in range(0, 2):
    for j in range(0, 5):
        axs[i, j].matshow(data_lowres[n])
        n = n + 1
plt.show()